

## Abstract

# Modular Machine Code Verification

Zhaozhong Ni

2007

Formally establishing safety properties of software presents a grand challenge to the computer science community. Producing proof-carrying code, *i.e.*, machine code with machine-checkable specifications and proofs, is particularly difficult for system softwares written in low-level languages. One central problem is the lack of verification theories that can handle the expressive power of low-level code in a modular fashion. In particular, traditional type- and logic-based verification approaches have restrictions on either expressive power or modularity.

This dissertation presents XCAP, a logic-based proof-carrying code framework for modular machine code verification. In XCAP, program specifications are written as general logic predicates, in which syntactic constructs are used to modularly specify some crucial higher-order programming concepts for system code, including embedded code pointers, impredicative polymorphisms, recursive invariants, and general references, all in a logical setting. Thus, XCAP achieves the expressive power of logic-based approaches and the modularity of type-based approaches. Its meta theory has been completely mechanized and proved.

XCAP can be used to directly certify system kernel code. This dissertation contains a mini certified thread library written in x86 assembly. Every single instruction in the library, including those for context switching and thread scheduling, has a formal XCAP specification and a proof. XCAP is also connected to existing certifying compiler; a type-preserving translation from a typed assembly language to XCAP is included.

| <b>Report Documentation Page</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |                                    |                                                     | <i>Form Approved<br/>OMB No. 0704-0188</i>                |                                   |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------|-----------------------------------------------------|-----------------------------------------------------------|-----------------------------------|
| <p>Public reporting burden for the collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to a penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number.</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |                                    |                                                     |                                                           |                                   |
| 1. REPORT DATE<br><b>MAY 2007</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 2. REPORT TYPE                     | 3. DATES COVERED<br><b>00-00-2007 to 00-00-2007</b> |                                                           |                                   |
| 4. TITLE AND SUBTITLE<br><b>Modular Machine Code Verification</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |                                    | 5a. CONTRACT NUMBER                                 |                                                           |                                   |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |                                    | 5b. GRANT NUMBER                                    |                                                           |                                   |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |                                    | 5c. PROGRAM ELEMENT NUMBER                          |                                                           |                                   |
| 6. AUTHOR(S)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |                                    | 5d. PROJECT NUMBER                                  |                                                           |                                   |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |                                    | 5e. TASK NUMBER                                     |                                                           |                                   |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |                                    | 5f. WORK UNIT NUMBER                                |                                                           |                                   |
| 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)<br><b>Yale University ,Department of Computer Science,New Haven,CT,06520-8285</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |                                    | 8. PERFORMING ORGANIZATION REPORT NUMBER            |                                                           |                                   |
| 9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |                                    | 10. SPONSOR/MONITOR'S ACRONYM(S)                    |                                                           |                                   |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |                                    | 11. SPONSOR/MONITOR'S REPORT NUMBER(S)              |                                                           |                                   |
| 12. DISTRIBUTION/AVAILABILITY STATEMENT<br><b>Approved for public release; distribution unlimited</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |                                    |                                                     |                                                           |                                   |
| 13. SUPPLEMENTARY NOTES                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |                                    |                                                     |                                                           |                                   |
| 14. ABSTRACT<br><b>Formally establishing safety properties of software presents a grand challenge to the computer science community. Producing proof-carrying code, i.e., machine code with machine-checkable specifications and proofs, is particularly difficult for system softwares written in low-level languages. One central problem is the lack of verification theories that can handle the expressive power of low-level code in a modular fashion. In particular, traditional type- and logic-based verification approaches have restrictions on either expressive power or modularity. This dissertation presents XCAP, a logic-based proof-carrying code framework for modular machine code verification. In XCAP, program specifications are written as general logic predicates, in which syntactic constructs are used to modularly specify some crucial higher-order programming concepts for system code, including embedded code pointers, impredicative polymorphisms, recursive invariants, and general references, all in a logical setting. Thus, XCAP achieves the expressive power of logic-based approaches and the modularity of type-based approaches. Its meta theory has been completely mechanized and proved. XCAP can be used to directly certify system kernel code. This dissertation contains a mini certified thread library written in x86 assembly. Every single instruction in the library, including those for context switching and thread scheduling, has a formal XCAP specification and a proof. XCAP is also connected to existing certifying compiler; a typepreserving translation from a typed assembly language to XCAP is included.</b> |                                    |                                                     |                                                           |                                   |
| 15. SUBJECT TERMS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |                                    |                                                     |                                                           |                                   |
| 16. SECURITY CLASSIFICATION OF:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |                                    |                                                     | 17. LIMITATION OF ABSTRACT<br><b>Same as Report (SAR)</b> | 18. NUMBER OF PAGES<br><b>183</b> |
| a. REPORT<br><b>unclassified</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | b. ABSTRACT<br><b>unclassified</b> | c. THIS PAGE<br><b>unclassified</b>                 | 19a. NAME OF RESPONSIBLE PERSON                           |                                   |

# **Modular Machine Code Verification**

A Dissertation  
Presented to the Faculty of the Graduate School  
of  
Yale University  
in Candidacy for the Degree of  
Doctor of Philosophy

By  
Zhaozhong Ni

Dissertation Director: Zhong Shao

May 2007



Copyright © 2007 by Zhaozhong Ni

All rights reserved.



# Contents

|                                                                        |            |
|------------------------------------------------------------------------|------------|
| <b>Acknowledgments</b>                                                 | <b>vii</b> |
| <b>1 Introduction</b>                                                  | <b>1</b>   |
| 1.1 Machine Code Verification . . . . .                                | 2          |
| 1.2 Previous Work . . . . .                                            | 4          |
| 1.3 Challenges and Contributions . . . . .                             | 6          |
| <b>2 Background</b>                                                    | <b>11</b>  |
| 2.1 The Target Machine . . . . .                                       | 11         |
| 2.2 Mechanized Meta Logic . . . . .                                    | 13         |
| 2.3 The Certified Assembly Programming Framework . . . . .             | 16         |
| <b>3 Embedded Code Pointers and the Basic XCAP Framework</b>           | <b>21</b>  |
| 3.1 Embedded Code Pointers . . . . .                                   | 22         |
| 3.2 Extended Propositions . . . . .                                    | 27         |
| 3.3 The XCAP Framework . . . . .                                       | 30         |
| 3.4 Discussion . . . . .                                               | 34         |
| <b>4 Impredicative Polymorphisms and Recursive Specifications</b>      | <b>37</b>  |
| 4.1 Impredicative Polymorphisms and Recursive Specifications . . . . . | 38         |
| 4.2 Solving Reynolds's ECP Problem . . . . .                           | 42         |
| 4.3 Recursive Specifications . . . . .                                 | 47         |

|          |                                                                    |            |
|----------|--------------------------------------------------------------------|------------|
| 4.4      | Discussion . . . . .                                               | 48         |
| <b>5</b> | <b>Weak Updates and a Translation from Typed Assembly Language</b> | <b>51</b>  |
| 5.1      | Weak Update in the Logical Setting . . . . .                       | 52         |
| 5.2      | Typed Assembly Language (TAL) . . . . .                            | 54         |
| 5.3      | A “Semantic” TAL Language . . . . .                                | 58         |
| 5.4      | Translations from TAL/STAL to XCAP . . . . .                       | 61         |
| 5.5      | Discussion . . . . .                                               | 64         |
| <b>6</b> | <b>A Port to x86 Machine</b>                                       | <b>67</b>  |
| 6.1      | Mini86: a Subset of the x86 Architecture . . . . .                 | 67         |
| 6.2      | XCAP86: a Port of XCAP on Mini86 . . . . .                         | 71         |
| 6.3      | Verification of a polymorphic queue module . . . . .               | 80         |
| <b>7</b> | <b>A Certified Mini Thread Library</b>                             | <b>85</b>  |
| 7.1      | MTH: A Mini Thread Library . . . . .                               | 86         |
| 7.2      | Verification of the Machine Context Module . . . . .               | 90         |
| 7.3      | Verification of the Threading Module . . . . .                     | 96         |
| 7.4      | Discussion . . . . .                                               | 107        |
| <b>8</b> | <b>The Coq Implementations</b>                                     | <b>109</b> |
| 8.1      | Implementation of <i>PropX</i> . . . . .                           | 110        |
| 8.2      | Implementation of XCAP . . . . .                                   | 114        |
| 8.3      | Implementation of XCAP86 . . . . .                                 | 116        |
| 8.4      | Implementation of MTH . . . . .                                    | 119        |
| <b>9</b> | <b>Conclusion and Future Work</b>                                  | <b>121</b> |
| 9.1      | Conclusion . . . . .                                               | 121        |
| 9.2      | Trusted Computing Base . . . . .                                   | 122        |
| 9.3      | Comparison with the Indexed Approach . . . . .                     | 125        |

|                                                            |            |
|------------------------------------------------------------|------------|
| 9.4 Future Work . . . . .                                  | 127        |
| <b>A PropX Validity Soundness Proof</b>                    | <b>133</b> |
| <b>B XCAP Soundness Proof</b>                              | <b>141</b> |
| <b>C TAL to XCAP Translation Typing Preservation Proof</b> | <b>147</b> |
| <b>D XCAP86 Soundness Proof</b>                            | <b>155</b> |
| <b>Bibliography</b>                                        | <b>163</b> |



# List of Figures

|     |                                                                               |    |
|-----|-------------------------------------------------------------------------------|----|
| 2.1 | Syntax of target machine . . . . .                                            | 12 |
| 2.2 | Operational semantics of target machine . . . . .                             | 13 |
| 2.3 | Mechanized meta logic . . . . .                                               | 14 |
| 2.4 | Assertion language of CAP . . . . .                                           | 17 |
| 2.5 | Inference rules of CAP . . . . .                                              | 18 |
| 3.1 | Extended propositions . . . . .                                               | 29 |
| 3.2 | Interpretations of extended propositions . . . . .                            | 29 |
| 3.3 | Assertion language of XCAP . . . . .                                          | 31 |
| 3.4 | Inference rules of XCAP . . . . .                                             | 32 |
| 4.1 | Validity rules for impredicative extended propositions . . . . .              | 39 |
| 4.2 | Code, specification, and illustration of the list append function . . . . .   | 45 |
| 4.3 | Code, specification, and illustration of the list append function (continued) | 46 |
| 5.1 | Type definitions of TAL . . . . .                                             | 55 |
| 5.2 | Top-level static semantics of TAL . . . . .                                   | 56 |
| 5.3 | Static semantics of TAL (subtyping and instruction typing) . . . . .          | 57 |
| 5.4 | Static semantics of TAL (state and value typing) . . . . .                    | 59 |
| 5.5 | Translations from TAL types to XCAP predicates . . . . .                      | 62 |
| 6.1 | Mini86 execution environment . . . . .                                        | 68 |
| 6.2 | Mini86 syntax . . . . .                                                       | 69 |

|      |                                                                              |     |
|------|------------------------------------------------------------------------------|-----|
| 6.3  | Dynamic semantics of Mini86 . . . . .                                        | 70  |
| 6.4  | Syntax of XCAP86 . . . . .                                                   | 72  |
| 6.5  | Validity rules of XCAP86 extended propositions . . . . .                     | 73  |
| 6.6  | Inference rules of XCAP86 . . . . .                                          | 75  |
| 6.7  | Function calling convention . . . . .                                        | 78  |
| 6.8  | Verification of queue insertion . . . . .                                    | 82  |
| 6.9  | Verification of queue deletion . . . . .                                     | 83  |
| 7.1  | Module structure and pseudo-C specification of MTH . . . . .                 | 87  |
| 7.2  | Threading model of MTH . . . . .                                             | 88  |
| 7.3  | Machine context . . . . .                                                    | 90  |
| 7.4  | Verification of machine context switching . . . . .                          | 93  |
| 7.5  | Verification of machine context loading . . . . .                            | 94  |
| 7.6  | Verification of machine context creation . . . . .                           | 95  |
| 7.7  | Thread control blocks and queues of MTH . . . . .                            | 96  |
| 7.8  | Threading invariant of MTH . . . . .                                         | 97  |
| 7.9  | Verification of thread yielding . . . . .                                    | 100 |
| 7.10 | Verification of thread creation . . . . .                                    | 101 |
| 7.11 | Verification of thread creation (continued) . . . . .                        | 102 |
| 7.12 | Verification of thread termination . . . . .                                 | 103 |
| 7.13 | Verification of thread initialization . . . . .                              | 104 |
| 7.14 | Verification of thread scheduler . . . . .                                   | 105 |
| 7.15 | Verification of thread scheduler (continued) . . . . .                       | 106 |
| A.1  | Assertion language of the full-featured XCAP . . . . .                       | 134 |
| A.2  | Validity rules for extended propositions of the full-featured XCAP . . . . . | 135 |
| A.3  | Normal natural deduction validity rules . . . . .                            | 137 |
| A.4  | Sequent style validity rules . . . . .                                       | 138 |

# Acknowledgments

The first person I want to thank is my advisor, Professor Zhong Shao. Entering the PhD program six years ago, I had little idea about what it would actually take to finish this dissertation. It is Professor Shao's encouraging advices and thoughtful guidance that helped and supported my pursuit of this exciting research. He patiently taught me knowledge and trained my skills starting with the very basics. From day one, he has been amazingly good at pushing me towards working with him more as a colleague than as a student. I remember and appreciate all the lively discussions, exciting discoveries, as well as disappointing failures and sometimes heated debates I shared with him during the past six years. I truly learned a lot from his broad vision on research, persistency in deep understanding, and ever-lasting enthusiasm.

I have been very lucky to have overlap with every other member of the FLINT group in its first ten years. Valery Trifonov was extremely helpful to my PhD study, especially during the first few years. Bratin Saha, Christopher League, and Stefan Monnier answered many of my naïve questions in the early years. I had close and pleasant collaborations with Dachuan Yu, Nadeem A. Hamid, and Xinyu Feng. Entering the PhD program in the same year, Hai Fang and I discussed and exchanged our experiences and frustrations during the various stages of the study. I also enjoyed the discussions with Andrew McCreight, Rodrigo Ferreira, and Alexander Vaynberg.

---

<sup>0</sup>This research is based on work supported in part by gifts from Intel and Microsoft, DARPA OASIS grant F30602-99-1-0519, NSF grant CCR-9901011, NSF ITR grant CCR-0081590, and NSF grant CCR-0524545. Any opinions, findings, and conclusions contained in this document are those of the authors and do not reflect the views of these agencies.

I would like to thank Professor Paul Hudak, Professor Carsten Schürmann, and Professor David Walker for serving on my thesis committee and being my thesis readers. Professor Paul Hudak taught me a formal semantics course and has been very helpful to my study in the PhD program. Professor Carsten Schürmann taught me my first formal method course and gave me many advices, especially in my first year. Professor David Walker gave me many instructional advices and had very helpful interactions with me on my research progress.

I would also like to thank Professor Drew McDermott and Professor Yang R. Yang for their help in my graduate study. I want to thank Linda Dobb and Judy Smith, who helped me with all the administrative stuffs.

Finally, I would like to thank my parents and my family, without the support from which the completion of my graduate study and this dissertation would be impossible. I am especially grateful to my wife, Zhiyan, whose love, care, and support are my source of energy along the journey.

# Chapter 1

## Introduction

We begin this dissertation with the following code:

```
swapcontext:
    ; save old context          ; load new context
    mov eax, [esp+4]            mov eax, [esp+8]
    mov [eax+eax], 0            mov esp, [eax+esp]
    mov [eax+ebx], ebx          mov ebp, [eax+ebp]
    mov [eax+ecx], ecx          mov edi, [eax+edi]
    mov [eax+edx], edx          mov esi, [eax+esi]
    mov [eax+esi], esi          mov edx, [eax+edx]
    mov [eax+edi], edi          mov ecx, [eax+ecx]
    mov [eax+ebp], ebp          mov ebx, [eax+ebx]
    mov [eax+esp], esp          mov eax, [eax+eax]
                                ret
```

This 19 lines of x86 assembly code constitutes a common routine for machine-context switching, omitting floating-point and special registers. The first (left) half of the code saves the current machine context, whereas the second (right) half loads the new machine context and resumes execution from there. Being executed virtually every millisecond on most PCs, code sequences similar to this are a crucial part of modern OS.

Surprisingly, despite the simplicity and small size of this piece of code, to the author's knowledge, it has never been rigorously proved to be safe and correct, whether manually or automatically. There is not even a rigorous statement on what this 19 lines of code should and should not do. In fact, this seemingly simple code sequence turns out to be extremely difficult to specify and reason about. The lack of safety guarantee at such a core of OS kernel makes any claim about software safety and security a "wish" instead of reality.

## 1.1 Machine Code Verification

Program correctness or, to a less degree, code safety, has long been one of the most desired goals for computer programmers and users. With great dependency on information technology in the modern world, individuals and businesses are getting more concerned about the reliability and the security of the computing infrastructure than ever. Great efforts and investments have been put into the research and development of software verification tools for a long time. In [32], formally establishing safety properties of software has been identified as a grand challenge to the computer science community.

From the point of view of software engineering, there are many levels of program abstractions. At the top level are algorithms and protocols, which normally get implemented in various high-level programming languages (including domain-specific languages), the later of which often get translated into and are supported by intermediate languages found either in compilers or in portable virtual machines. In the end there is assembly and machine code, which carries out the actual computations and provides critical runtime support.

Surprisingly, in the entire hierarchy of program verification there is one important level that receives much less attention than the others. Despite the fact that all higher level abstraction and verification efforts will eventually be based upon the machine code level, there have been much fewer existing works done in machine code verification. The direct result of the lack of machine code level verification support is that most of the safe algorithms/protocols/programs usually lose their safety guarantees when being implemented or translated into binary.

Recently, formal studies on critical code such as OS kernels are attracting growing interests. Among other techniques, type systems and program logics have been widely applied in modern programming language and OS researches. One representative example is the Singularity project [33], which aims to build a highly reliable OS using a type-safe programming language (C#) and other techniques for program specification and verifi-

cation. Another example is the Verisoft project [23], which uses computer-aided logical proofs to obtain assurance of the correctness of critical systems including OS kernels. Unfortunately, both fall short at the code for context switching at the beginning of this chapter: Singularity uses unsafe assembly code for programming context switching; Verisoft has not approached below the level of user processes in software verification.

The lack of work on self-contained verification of machine level context and thread implementation reveals subtle limitations of traditional methods. For example, although type systems provide excellent support for modularity and higher-order features, they tend to be very specialized when dealing with low-level system code, raising both flexibility and interoperability issues. As another example, many program logics suffer from the weak support on higher-order features such as embedded code pointers [53], as will be discussed in details in this dissertation.

The purpose of this thesis work is to advance the research of machine code verification. In particular, we want to achieve a better balance between expressiveness and modularity over existing methods for low-level system and application code verification. Although theoretically one could apply any verification methods on machine code, expressiveness and modularity are two extremely important criteria for machine code verification.

One of the main reasons that people write code in low-level languages instead of high-level languages is the expressiveness of the low-level languages themselves. The speed, flexibility, and precise control of machine resources are examples of the expressiveness of machine languages. To verify such kind of code, the expressiveness of the verification system has to be much more powerful than those used for high-level languages.

Separate compilation is important for high-level programs. While machine code does not need to be compiled, in many cases it still needs to be linked together before execution. Moreover, separate verification is even more important for machine code than high-level programs, since machine code is generally larger in size and more tedious than a corresponding high-level program. Its verification condition and process are also generally much larger and slower than the high-level's. For a machine code verification system to

be practical and efficient, it must have as good support of modularity as the high-level verification systems.

## 1.2 Previous Work

In this section, we discuss existing efforts on low-level and machine code verification.

**Typed assembly languages.** Type system has been popular and proved successful in programming languages. Thus, a natural way to address the machine and assembly level program verification is to design a type system for assembly language. Partly inspired by the Typed Intermediate Language (TIL) [57], Type Assembly Language (TAL) [41] is a first effort to address the low-level code safety problem in the traditional type system manner. The specification of TAL is written in syntactically defined types, including integer, tuple, code pointer, polymorphic, and existential types. Since the type checking of TAL is syntax directed and decidable, the verification process is fully automatic. By using certifying compilation technology, many of the existing high-level programming languages can be translated into TAL code with type annotation. Later improvement of TAL includes stack and exception [40], modularity and dynamic linking [24], dependent type and arrays [60], recursive type and connections with foundational proof-carrying code [29], heterogeneous tuples and disjoint sums [13], low-level optimization [11].

However, there are lot safety policies which have not been supported by the above work on TAL. For example, memory management are not handle in any of them. The fact that most of TAL languages require a hard-wired “malloc” or “alloc” instruction, which is generally a library function above the machine instruction level, is due to the inability of them to type check precise memory management. To support these safety properties in TAL, one would need to incorporate more and more features into the type system and the interoperations between different versions of TALs may also be problematic.

**Proof-carrying code.** Traditionally, the code consumer receives binary from the code producer and verify the code before executing it. To shift the burden of verification from code consumer to code producer, Proof-Carrying Code (PCC) [45, 43, 44, 12] allows a code producer to provide a machine (DEC Alpha) language program to a host along with a formal proof of its safety. The specification language in PCC is typing predicates with some universal logic connectors and the safety policy is expressed as pre/post conditions of code sequences. The proofs are written in a logic extended with many language-specific typing rules. By using verification-condition generator (VCgen), both the code producer and the code consumer can automatically calculate the safety requirements (lemmas) at each instruction from the safety policy. For those simple examples, the theorem prover can automatically prove those lemmas. Code consumer would simply use a proof checker to ensure that the proofs do prove those lemmas. By proving the soundness of the VCgen algorithm, the code consumer can be sure that the safety policy has been satisfied. Since the source language’s type system is somehow embedded in the logic, PCC can enjoy most of the benefits of source type systems such as modularity and support more general properties through logic formulas.

The VCgen and proof checker are inside the trusted computing base and are error-prone since they all involve language-specific typing rules which themselves are not always error-free. For example, [36] discovered a serious bug in the typing rules. Another difficulty of PCC is that language-specific typing rules makes interoperation between code written in different languages ad hoc, if not hard at all.

**Foundational proof-carrying code.** Foundational Proof-Carrying Code (FPCC) [6, 5] tackles the problems of PCC by constructing and verifying its proofs using strictly the foundations of mathematical logic, with no type-specific axioms. The machine semantics of Sun Sparc is formalized in the logic in the first place. VCgen can now be removed from trusted computing base. Instead of directly verifying binaries in the machine level, exiting work of FPCC all takes programs written in a slightly higher-level TAL-like language

and automatically general their corresponding FPCC according to the types and typing derivations. In the case of semantic approach [6, 18, 3, 8] types, typing derivations, and soundness proof are directly interpreted into foundational logic’s formula and proofs. The syntactic approach [29] simply encodes the syntactic type system in the logic and obtain the FPCC by mapping between machine and FTAL steps and utilizing the encoded FTAL soundness proof.

**Automated proofs of object code for a widely used microprocessor.** Yu [65] uses a limited first-order predicate logic (ACL) to define and specify the machine (Motorola MC68020) and safety policy (Hoare-style pre/post-condition relation on binary encoding of machine program segment). The proof of the safety properties is done by manually specifying lemmas needed and their usage, and letting the Boyer-Moore Theorem Prover (Nythm) to complete the whole proof. There the safety properties are mostly correctness of functions, including those in a unix string library.

**Cyclone / C-Cured / Vault.** Cyclone [35], CCured [46], and Vault [16] are safe C-like languages which aim at providing verification support at a level close to assembly. They could serve as the low-level system programming language connecting with machine code verification system when a very expressive specification language is not required.

**JVML / MSIL.** Java Virtual Machine Language [27] and Microsoft Intermediate Language [37] are intermediate-level portable languages for virtual machines. All Java and C# programs are compiled to and distributed in these languages. These languages can be further compiled to TAL-like languages.

### 1.3 Challenges and Contributions

The work in this dissertation lies within the PCC (and FPCC) framework. In principle, PCC can be used to verify safety properties of arbitrary machine-language programs.

Existing PCC systems [12, 29, 11, 13], however, have focused on programs written in type-safe languages [26, 38] or variants of typed assembly languages [41]. Type-based approaches are attractive because they facilitate automatic generation of the safety proofs (by using certifying compilers) and provide great support to modularity and higher-order language features. But they also suffer from several serious limitations. First, types are not expressive enough to specify sophisticated invariants commonly seen in the verification of low-level system software. Recent works on *logic-based type systems* [61, 55, 14, 1, 4] have made types more expressive but they still cannot specify advanced state invariants and assertions [54, 63, 64] definable in a general-purpose predicate logic with inductive definitions [58]. Second, type systems are too weak to prove advanced properties and program correctness, especially in the context of concurrent assembly code [63, 20]. Finally, languages of different style often require different type systems, making it hard to reason about interoperability.

An alternative to type-based methods is to use Hoare logic [22, 31]—a widely applied technique in program verification. Hoare logic supports formal reasoning using very expressive assertions and inference rules from a general-purpose logic. In the context of foundational proof-carrying code (FPCC) [5, 63], the assertion language is often unified with the mechanized meta logic (following Gordon [25])—proofs for Hoare logic consequence relation and Hoare-style program derivations are explicitly written out in a proof assistant. For example, Touchstone PCC [44] used Hoare-style assertions to express complex program invariants. Appel *et al* [6, 9] used Hoare-style state predicates to construct a general semantic model for machine-level programs. Yu *et al* [63, 64, 20] recently developed a certified assembly programming framework that uses Hoare-style reasoning to verify low-level system libraries and general multi-threaded assembly programs.

Unfortunately, Hoare logic, as Reynolds [54] observed, does not support higher-order features such as embedded code pointers (ECPs) well. Similarly, features such as impredicative polymorphisms, recursive specifications, and weak update references, are not handled well simultaneously in the logical setting. How to modularly support these fea-

tures without sacrificing the expressive power of logic is the first and major challenge we face when doing machine code verification.

The second challenge we face when doing machine code verification is whether the verification method is applicable to real-world program verification on realistic system kernel code.

The third challenge is whether the results of verification is an isolated one or can actually connect with other verification systems such as type systems.

The contributions of this dissertation can be summarized into the following aspects, which corresponds to the structure of the dissertation.

**Embedded code pointer and the basic XCAP framework.** We designed the XCAP framework, which provides the first simple and general solution to the ECP problem for Hoare-logic verification systems. By “simple”, we mean that our method does not alter the structure of Hoare-style program derivations and assertions. By “general”, we mean that our technique can indeed handle all kinds of machine-level ECPs, including those hidden inside higher-order closures. Relevant details can be found in Chapter 3.

**Impredicative polymorphisms and recursive specifications.** We extended the XCAP framework to support impredicative polymorphisms and recursive specifications. Both extensions cause very little change in the XCAP inference rules and meta theory. Thus, they are light weight and proved sound. Using the extended XCAP, we solve the ECP problem for separation logic [54] and present a verification of a destructive list-append function, which is listed as an example in [54]. Relevant details can be found in Chapter 4.

**Weak update and a translation from typed assembly language.** Weak update, also termed as “general reference”, is another higher-order feature that logic-based verification methods fail to support well. We once again extended the XCAP framework to support weak update, using similar syntactic technique for the support of ECP in Chapter 3.

We then explore the relationship between XCAP and typed assembly languages (TAL). TAL and XCAP are suitable for different kinds of verification tasks. Previously, programs verified in either one of them can not interoperate freely with the other, making it hard to integrate them into a complete system. Moreover, the relationship between TAL and CAP/XCAP lines of work has not been discussed extensively.

We compare the type-based and logic-based methods by presenting a type-preserving translation from a TAL language to XCAP. The translation involves an intermediate step of a “semantic” TAL language. Our translation supports polymorphic code, mutable reference, existential, and recursive types. Since we proved typing preservation for the translation from TAL to XCAP, there is a clear path to link and interoperate well-typed programs from traditional certifying compilers with certified libraries by XCAP. Relevant details can be found in Chapter 5.

**A port of XCAP to the x86 architecture.** We port XCAP to a faithful subset of the x86 architecture, which adds the support of instruction decoding, finite machine word, word-aligned byte-addressed memory, conditional flags, built-in stack and push/pop instructions, and function call/return instructions. On top of it, we made practical adaptations and built useful abstractions, particularly on the handling of the stack and function calls. We demonstrate its usage and these abstractions for the verification of a polymorphic queue module. Relevant details can be found in Chapter 6.

**A certified mini thread library.** We mechanically verified a thread implementation at the machine level using the ported XCAP. Using the first mechanized proof for the safety of a machine-level thread implementation, we demonstrate the power and practicality of the XCAP framework, and thus the fact that the certification of complex machine-level system code is not beyond reach. The specifications and proof of MTH modules and routines are modular. Each piece of the code is specified and proved with minimal reference to external code. For example, in the verification of context module, there is no mention

of thread at all. More details can be found in Chapter 7.

**Mechanization.** One key feature of the XCAP framework is mechanization. Not only did we mechanize our machine syntax, machine semantics, assertion languages, interpretations, inference rules, and program specifications and proofs in a general mathematic logic, we also mechanized the complete meta theory of XCAP. Plus, we want to directly obtain the power of higher-order predicate logic, through a shallow embedding. For these reasons, our usage of the Coq proof assistant [58] is to treat it as both a mechanized logic framework and a mechanized meta logic framework. More details can be found in Chapter 8.

# Chapter 2

## Background

Most of the theoretical presentations and discussions in this dissertation are based on a same RISC-like ideal target machine (TM). Most of the verification systems and proofs in this dissertation assume a same underlying formal meta logic—a variant of the Calculus of Inductive Constructions (CiC) [50], upon which the Coq proof assistant is based. For proof mechanization purpose, we refer to it as our “mechanized meta logic”. Certified Assembly Programming (CAP) is the logic-based machine code verification framework from which the work in this dissertation evolves. In this chapter, we prepare the reader with the machine, the logic, and the baseline CAP.

### 2.1 The Target Machine

All theoretical results presented in this dissertation share a common raw target machine TM, as defined in Figure 2.1. A TM program ( $\mathbb{P}$ ), an entire machine configuration, consists of a code heap ( $\mathbb{C}$ ), a dynamic state component ( $\mathbb{S}$ ) made up of a register file ( $\mathbb{R}$ ) and a data heap ( $\mathbb{H}$ ), and an instruction sequence ( $\mathbb{I}$ ) to be executed next. Code heap is a collection of code labels ( $f$ ) and the code sequence they points to. The register file is made up of 32 registers, while the data heap is a partial mapping from data labels ( $l$ ) to machine words ( $w$ ). In essence,  $f$ ,  $l$ , and  $w$  are all just plain natural numbers ( $i$ ). Since TM has an infinite

|                       |              |                                                                                                                                                                                                                                                              |
|-----------------------|--------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <i>(Program)</i>      | $\mathbb{P}$ | $::= (\mathbb{C}, \mathbb{S}, \mathbb{I})$                                                                                                                                                                                                                   |
| <i>(CodeHeap)</i>     | $\mathbb{C}$ | $::= \{f \rightsquigarrow \mathbb{I}\}^*$                                                                                                                                                                                                                    |
| <i>(State)</i>        | $\mathbb{S}$ | $::= (\mathbb{H}, \mathbb{R})$                                                                                                                                                                                                                               |
| <i>(Mem)</i>          | $\mathbb{H}$ | $::= \{l \rightsquigarrow w\}^*$                                                                                                                                                                                                                             |
| <i>(Regfile)</i>      | $\mathbb{R}$ | $::= \{r \rightsquigarrow w\}^*$                                                                                                                                                                                                                             |
| <i>(Reg)</i>          | $r$          | $::= \{r_k\}_{k \in \{0 \dots 31\}}$                                                                                                                                                                                                                         |
| <i>(Word, Labels)</i> | $w, f, l$    | $::= i \text{ (nat nums)}$                                                                                                                                                                                                                                   |
| <i>(InstrSeq)</i>     | $\mathbb{I}$ | $::= c; \mathbb{I} \mid \text{jd } f \mid \text{jmp } r$                                                                                                                                                                                                     |
| <i>(Instr)</i>        | $c$          | $::= \text{add } r_d, r_s, r_t \mid \text{addi } r_d, r_s, i \mid \text{alloc } r_d, i \mid \text{bgti } r_s, i, f$<br>$\mid \text{free } r_s, i \mid \text{ld } r_d, r_s(i) \mid \text{mov } r_d, r_s \mid \text{movi } r_d, i \mid \text{st } r_d(i), r_s$ |

Figure 2.1: Syntax of target machine

word size, we expect its data heap to be bounded but infinite—there is no upper limit on how large a data heap can be, but at any program execution point, all the allocated heap cells are below a certain boundary (this is because in TM one can only allocate a finite amount of memory in each allocation).

TM follows the RISC style for simplicity. Its instruction set of TM is minimal but extensions are straightforward. There are instructions (c) for arithmetic operations (`add rd, rs, rt`, `addi rd, rs, i`), data movements (`mov rd, rs` and `movi rd, i`), memory allocation/deallocation (`alloc rd, i` and `free rs, i`), memory accesses (`ld rd, rs(i)` and `st rd(i), rs`), and condition jumps (`bgti rs, i, f`). Each code block (instruction sequence) must end with either a direct jump (`jd f`) or an indirect jump (`jmp r`).

The operational semantics of this language (see Figure 2.2) should pose no surprise. The add instructions add values from source registers and immediate values and put the sums into the destination registers. The data movement instructions take either the source register value or an immediate value, and put it into the destination register. Heap allocation instruction takes a required size *i* and allocates a continuous memory block of that size, with its initial value undefined (so for safe execution of programs, these values should not be used). To dispose a piece of memory, every cell in it must be previously allocated. Similarly, to access a memory cell, it must be previously allocated. Whether

|                                      |                                                                                                                                                                             |
|--------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <b>if</b> $\mathbb{I} =$             | <b>then</b> $(\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{I}) \longmapsto$                                                                                                |
| <b>jd</b> $f$                        | $(\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{C}(f))$ when $f \in \text{dom}(\mathbb{C})$                                                                                 |
| <b>jmp</b> $r$                       | $(\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{C}(\mathbb{R}(r)))$ when $\mathbb{R}(r) \in \text{dom}(\mathbb{C})$                                                         |
| <b>bgti</b> $r_s, i, f; \mathbb{I}'$ | $(\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{I}')$ when $\mathbb{R}(r_s) \leq i$ ;<br>$(\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{C}(f))$ when $\mathbb{R}(r_s) > i$ |
| <b>c; <math>\mathbb{I}'</math></b>   | $(\mathbb{C}, \text{Next}_c(\mathbb{H}, \mathbb{R}), \mathbb{I}')$                                                                                                          |

where

|                            |                                                                                                                                                                                                    |
|----------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <b>if</b> $c =$            | <b>then</b> $\text{Next}_c(\mathbb{H}, \mathbb{R}) =$                                                                                                                                              |
| <b>add</b> $r_d, r_s, r_t$ | $(\mathbb{H}, \mathbb{R}\{r_d \rightsquigarrow \mathbb{R}(r_s) + \mathbb{R}(r_t)\})$                                                                                                               |
| <b>addi</b> $r_d, r_s, i$  | $(\mathbb{H}, \mathbb{R}\{r_d \rightsquigarrow \mathbb{R}(r_s) + i\})$                                                                                                                             |
| <b>mov</b> $r_d, r_s$      | $(\mathbb{H}, \mathbb{R}\{r_d \rightsquigarrow \mathbb{R}(r_s)\})$                                                                                                                                 |
| <b>movi</b> $r_d, i$       | $(\mathbb{H}, \mathbb{R}\{r_d \rightsquigarrow i\})$                                                                                                                                               |
| <b>alloc</b> $r_d, i$      | $(\mathbb{H}\{1 \rightsquigarrow \_, \dots, 1+i-1 \rightsquigarrow \_}, \mathbb{R}\{r_d \rightsquigarrow 1\})$<br>where $1, \dots, 1+i-1 \notin \text{dom}(\mathbb{H})$ and $\_$ is a random value |
| <b>free</b> $r_s, i$       | $(\mathbb{H}/\{\mathbb{R}(r_s), \dots, \mathbb{R}(r_s) + i-1\}, \mathbb{R})$<br>when $\mathbb{R}(r_s), \dots, \mathbb{R}(r_s) + i-1 \in \text{dom}(\mathbb{H})$                                    |
| <b>ld</b> $r_d, r_s(i)$    | $(\mathbb{H}, \mathbb{R}\{r_d \rightsquigarrow \mathbb{H}(\mathbb{R}(r_s) + i)\})$ when $\mathbb{R}(r_s) + i \in \text{dom}(\mathbb{H})$                                                           |
| <b>st</b> $r_d(i), r_s$    | $(\mathbb{H}\{\mathbb{R}(r_d) + i \rightsquigarrow \mathbb{R}(r_s)\}, \mathbb{R})$ when $\mathbb{R}(r_d) + i \in \text{dom}(\mathbb{H})$                                                           |

Figure 2.2: Operational semantics of target machine

directly jumping to an immediate code label or a register value, the target address must point to a piece of code in the code heap. Branch instruction compares the register value with an immediate value, and decides if it should continue with the next instruction or jump to somewhere else. In all the above cases, when the side-conditions of instructions are unsatisfied, the execution of TM gets “stuck.” To guarantee that a TM program will never enter the “stuck” status is the minimum requirement for any verification systems.

## 2.2 Mechanized Meta Logic

Both the syntax and the operational semantics of TM, as well as all the verifications systems, program specifications, and proofs in this dissertation, are defined upon a formal meta logic. In this dissertation we use a variant of the calculus of inductive constructions (CiC) [50], which is a higher-order predicate logic extended with powerful inductive def-

$$\begin{aligned}
(Term) \ A, B ::= & \ Set \mid Prop \mid Type \mid x \mid \lambda x:A. B \mid A \ B \mid A \rightarrow B \mid \Pi x:A. B \mid \text{inductive definitions} \\
(Prop) \ p, q ::= & \ True \mid False \mid \neg p \mid p \wedge q \mid p \vee q \mid p \supset q \mid \forall x:A. p \mid \exists x:A. p \mid \dots
\end{aligned}$$

Figure 2.3: Mechanized meta logic

initions. We informally present our formal meta logic in Figure 2.3. Terms in CiC can be sorts (Set, Prop, and Type), variables, function abstractions, function applications, non-dependent products, dependent products, or inductive definitions. We omit the details of inductive definitions and will instead use examples to explain them later. The logic part of CiC contains terms of Prop sort, which provides common logic quantifiers and connectives, and allows user expansions by inductively defined propositions.

Since one goal of the work in this dissertation is to generate machine-checkable proofs, we want our formal meta logic to be a mechanized one. We select CiC based on the fact that it is the underlying logic theory for the Coq proof assistant [58]. We use Coq for two purposes. The first is to use it as a mechanized meta logic framework, in which we can represent all the syntax, rules, and meta theory of our machine and verification systems in the *Set* and *Type* sort of Coq. For example, to represent TM, machine state can be embedded as a *State* type (which has *Set* sort in Coq); registers, machine instruction sequences, commands, and execution semantics can be defined as inductive definitions.

```

Inductive Register : Set := r0 | r1 | ... | r31.

Inductive Command : Set := add : Register -> Register -> Register -> Command
| addi : Register -> Register -> Word      -> Command
| mov : Register -> Register             -> Command
| ... .

Inductive InstrSeq : Set := iseq : Command -> InstrSeq -> InstrSeq
| jd   : Word           -> InstrSeq
| jmp  : Register       -> InstrSeq
| ... .

Inductive Next : Command -> State -> State -> Prop :=
| stp_add : forall rd rs rt H R,
  Next (add rd rs rt) (H, R) (H, uR R rd (R rs + R rt))
| stp_mov : forall rd rs H R,
  Next (mov rd rs) (H, R) (H, uR R rd (R rs))
| stp_ld  : forall rd rs w H R w',
  lookup H (R rs + w) w' ->
  Next (ld rd rs w) (H, R) (H, uR R rd w')
| ... .

Inductive STEP : Program -> Program -> Prop :=
| stp_iseq : forall C S S' c I,
  Next c S S' ->
  STEP (C, (S, iseq c I)) (C, (S', I))
| stp_jd   : forall C S l I,
  lookup C l I ->
  STEP (C, (S, jd l)) (C, (S, I))
| stp_jmp  : forall C S r I,
  lookup C (R S r) I ->
  STEP (C, (S, jmp r)) (C, (S, I))
| ... .

```

General safety policies can then be defined as meta logic predicates over the entire machine configuration; they will have the *Program*  $\rightarrow$  *Prop* type. For example, the simple “non-stuckness” safety policy for TM can be defined as follows:

$$\lambda \mathbb{P}. \forall n: \text{Nat}. \exists \mathbb{P}': \text{Program}. \mathbb{P} \xrightarrow{n} \mathbb{P}'.$$

Here  $\xrightarrow{n}$  is the composition of “ $\xrightarrow{}$ ” for  $n$  times. Under this setting, if *Safe()* denotes a particular customized safety policy, a certified binary is just a pair of program  $\mathbb{P}$  together with a proof object of type *Safe*( $\mathbb{P}$ ), all represented in the mechanized meta logic.

The second usage of CiC/Coq is to directly use its build-in higher-order predicate logic (*Prop*) to serve as (part of) the assertion logics, which are used to write program specifi-

cations. Thus there is no need to define a complete new general logic and its meta theory, which makes the approaches in this paper lightweight. In other word, we use shallow embedding instead of deep embedding for the higher-order predicate logic part of the assertion languages in this dissertation. Actually, right in the next section, we will see how the original CAP utilizes this technology and becomes an extremely simple framework.

It is important to note that, although the theory and implementation in this dissertation are presented and done with CiC/Coq, we believe that the key ideas from our results are applicable to other mechanized meta logics.

## 2.3 The Certified Assembly Programming Framework

As suggested by its name, certified assembly programming, CAP [63] is a logic-based verification framework for machine code verification. In essence, CAP is a Hoare-logic framework for reasoning about assembly programs. Note that the CAP framework presented in this section is slightly different from the original one, as we want to let it share a very similar structure with the work in this dissertation.

Traditionally, Hoare-logic systems usually use the following judgment to reason about program safety and correctness.

$$\{precondition\} \ statement \ \{postcondition\}$$

Where pre- and post-conditions are written in certain assertion logic, and describe the state before and after the execution of *statement*.

In this dissertation, however, TM programs are often written in continuation-passing style [7], as there are no instructions directly in correspondence with function call and return in a high-level language. Hence post-conditions in Hoare logic do not have explicit counterparts in CAP; they are often interpreted as preconditions of the return continuations. The basic form of judgment in CAP is

$$\{a\} \ \mathbb{I}$$

$$\begin{aligned}
(CdHpSpec) \quad \Psi &::= \{f \rightsquigarrow a\}^* \\
(Assertion) \quad a &\in State \rightarrow Prop \\
(AssertImp) \quad a \Rightarrow a' &\triangleq \forall \mathbb{S}. a \mathbb{S} \supset a' \mathbb{S} \\
(StepImp) \quad a \Rightarrow_C a' &\triangleq \forall \mathbb{S}. a \mathbb{S} \supset a' \text{Next}_C(\mathbb{S})
\end{aligned}$$

Figure 2.4: Assertion language of CAP

where  $\mathbb{I}$  is the code block to be reasoned about, and  $a$  is an assertion describing the expectation on machine states before executing  $\mathbb{I}$ .

**Assertion language.** Following Gordon [25], CAP’s assertion language, as presented in Figure 2.4, is directly unified with the underlying mechanized meta logic (*i.e.*, shallow embedding). CAP assertions ( $a$ ) tracks the state component in the machine configuration, so any terms of type  $State \rightarrow Prop$  are valid CAP assertions. For example, an assertion specifying that the registers  $r_1$  and  $r_2$  store a same value and the register  $r_3$  contains a non-NULL value can be written as:

$$\lambda(\mathbb{H}, \mathbb{R}). \mathbb{R}(r_1) = \mathbb{R}(r_2) \wedge \mathbb{R}(r_3) \neq \text{NULL}.$$

To simplify the presentation, we lift propositional implication ( $\supset$ ) to assertion level ( $\Rightarrow$ ).

In Figure 2.4, there is a construct named code heap specification ( $\Psi$ ) for expressing user-defined safety requirements on program. A code heap specification associates every code label with an assertion, with the intention that the pre-condition of a code block is described by the corresponding assertion.

**Inference rules.** CAP defines a set of inference rules for proving judgments for well-formed programs, code heaps, and instruction sequences (see Figure 2.5).

Top-down, a TM program is well-formed (rule PROG) under assertion  $a$  if both the global code heap and the current instruction sequence are well-formed and the machine state satisfies assertion  $a$ .

To support separate verification of code modules, we have made some changes to

|                                                                                                                                                                                                                                                                                                                                                                                          |                              |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------|
| $\boxed{\Psi_G \vdash \{a\} \mathbb{P}}$                                                                                                                                                                                                                                                                                                                                                 | <b>(Well-formed Program)</b> |
| $\frac{\Psi_G \vdash \mathbb{C} : \Psi_G \quad (a \mathbb{S}) \quad \Psi_G \vdash \{a\} \mathbb{I}}{\Psi_G \vdash \{a\} (\mathbb{C}, \mathbb{S}, \mathbb{I})} \text{ (PROG)}$                                                                                                                                                                                                            |                              |
| $\boxed{\Psi_{IN} \vdash \mathbb{C} : \Psi}$ <b>(Well-formed Code Heap)</b>                                                                                                                                                                                                                                                                                                              |                              |
| $\frac{\Psi_{IN} \vdash \{a_i\} \mathbb{I}_i \quad \forall f_i}{\Psi_{IN} \vdash \{f_1 \rightsquigarrow \mathbb{I}_1, \dots, f_n \rightsquigarrow \mathbb{I}_n\} : \{f_1 \rightsquigarrow a_1, \dots, f_n \rightsquigarrow a_n\}} \text{ (CDHP)}$                                                                                                                                        |                              |
| $\frac{\Psi_{IN1} \vdash \mathbb{C}_1 : \Psi_1 \quad \Psi_{IN2} \vdash \mathbb{C}_2 : \Psi_2 \quad \Psi_{IN1}(f) = \Psi_{IN2}(f) \quad \text{dom}(\mathbb{C}_1) \cap \text{dom}(\mathbb{C}_2) = \emptyset \quad \forall f \in \text{dom}(\Psi_{IN1}) \cap \text{dom}(\Psi_{IN2})}{\Psi_{IN1} \cup \Psi_{IN2} \vdash \mathbb{C}_1 \cup \mathbb{C}_2 : \Psi_1 \cup \Psi_2} \text{ (LINK)}$ |                              |
| $\boxed{\Psi \vdash \{a\} \mathbb{I}}$ <b>(Well-formed Instruction Sequence)</b>                                                                                                                                                                                                                                                                                                         |                              |
| $\frac{a \Rightarrow_C a' \quad \Psi \vdash \{a'\} \mathbb{I} \quad c \in \{\text{add, addi, mov, movi, alloc, free, ld, st}\}}{\Psi \vdash \{a\} c; \mathbb{I}} \text{ (SEQ)}$                                                                                                                                                                                                          |                              |
| $\frac{a \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi)}{\Psi \vdash \{a\} \text{ jd } f} \text{ (JD)}$                                                                                                                                                                                                                                                                                |                              |
| $\frac{(\lambda(\mathbb{H}, \mathbb{R}). \mathbb{R}(r_s) \leq i \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow a' \quad \Psi \vdash \{a'\} \mathbb{I} \quad (\lambda(\mathbb{H}, \mathbb{R}). \mathbb{R}(r_s) > i \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi)}{\Psi \vdash \{a\} \text{ bgti } r_s, i, f; \mathbb{I}} \text{ (BGTI)}$              |                              |
| $\frac{a \Rightarrow (\lambda(\mathbb{H}, \mathbb{R}). a'(\mathbb{H}, \mathbb{R}) \wedge \mathbb{R}(r) \in \text{dom}(\Psi) \wedge \Psi(\mathbb{R}(r)) = a')}{\Psi \vdash \{a\} \text{ jmp } r} \text{ (JMP)}$                                                                                                                                                                           |                              |

Figure 2.5: Inference rules of CAP

the inference rules for well-formed code heaps from the original CAP. A module is defined as a small code heap which can contain as few as one code block. Each module is associated with an “import”  $\Psi_{IN}$  interface and an “export” interface  $\Psi$ . A programmer first establishes well-formedness of each individual module via the `CDHP` rule. Two non-conflicting modules can then be linked together via the `LINK` rule. All code blocks will eventually be linked together to form a single global code heap with specification  $\Psi_G$  (which is used in the well-formed program rule). These two code heap rules provide basic support for modular verification. However, as we will show in the next Chapter, modularity breaks down when we reason about first-class code pointers passing across the module’s boundary.

The intuition behind well-formed instruction sequence judgment is that if the instruction sequence  $\mathbb{I}$  starts execution in a machine state which satisfies assertion  $a$ , then executing  $\mathbb{I}$  is safe with respect to the specification  $\Psi$ . An instruction sequence preceded by  $c$  is safe (rule `SEQ`) if we can find another assertion  $a'$  which serves both as the post-condition of  $c$ , (that is,  $a'$  holds on the updated machine state after executing  $c$ ), and as the precondition of the tail instruction sequence.

A direct jump is safe (rule `JD`) if the current assertion can imply the precondition of the target code block as specified in  $\Psi$ . An indirect jump is similar (rule `JMP`) except that it refers to the register file for the target code label; unfortunately, this treatment of first class code pointers requires reasoning about global control flow and breaks the modularity (see the next chapter).

A programmer’s task, when proving the well-formedness of a code block, involves mostly applying the appropriate inference rules, finding intermediate assertions like  $a'$ , and proving all the assertion subsumption relations (which are implemented as logical implications in the mechanized meta logic).

**Soundness.** The soundness theorem below guarantees that given a well-formed CAP program, starting with its current instruction sequence, the machine will never get stuck.

### Theorem 2.1 (CAP Soundness)

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then for all natural number  $n$ , there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \xrightarrow{n} \mathbb{P}'$ .

The proof for this theorem can be established following the syntactic approach of proving type soundness [59] by proving the progress and preservation lemmas.

### Lemma 2.2 (CAP Progress)

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \xrightarrow{} \mathbb{P}'$ .

### Lemma 2.3 (CAP Preservation)

If  $\Psi_G \vdash \{a\} \mathbb{P}$  and  $\mathbb{P} \xrightarrow{} \mathbb{P}'$  then there exists an assertion  $a'$  such that  $\Psi_G \vdash \{a'\} \mathbb{P}'$ .

In order to address the new `LINK` rule, we need the following code heap typing lemma, whose proof needs the instruction sequence weakening lemma below.

### Lemma 2.4 (CAP Code Heap Typing)

If  $\Psi_{IN} \vdash \mathbb{C} : \Psi$  and  $f \in \text{dom}(\Psi)$ , then  $f \in \text{dom}(\mathbb{C})$  and  $\Psi_{IN} \vdash \{\Psi(f)\} \mathbb{C}(f)$ .

### Lemma 2.5 (CAP Instruction Sequence Weakening)

If  $\Psi \vdash \{a\} \mathbb{I}$ ,  $\Psi \subseteq \Psi'$ , and  $a' \Rightarrow a$  then  $\Psi' \vdash \{a'\} \mathbb{I}$ .

Yu *et al* [63, 64] have also shown that CAP can be easily extended to prove more general safety properties by introducing invariant assertions into the inference rules. Furthermore, by mechanizing the CAP inference rules and the soundness proofs in Coq, we can easily construct FPCC packages for CAP programs [63, 28].

There have been many works following the CAP framework: dynamic storage allocation [63], interfacing with TAL [28], concurrent verification [64, 20], support of embedded code pointers [47] (to be discussed in this dissertation), stack-based control abstraction [21], and open framework for interoperation [19].

## Chapter 3

# Embedded Code Pointers and the Basic XCAP Framework

In this chapter we present an important problem with CAP and other logic-based verification methods: embedded code pointers (ECP). Being a crucial concept for programming, the lack of modular and expressive ECP support prevents these methods from being used to certify realistic system kernel code. We then present our solution for the ECP problem, a new XCAP framework evolved from CAP. We first discuss the idea of “extended propositions,” which can be roughly viewed as a mixture of logic formulas and syntactic constructs. We then discuss the basic structure of the new XCAP framework and show how to use syntactic techniques to perform modular reasoning on ECPs while still retaining the expressive power of Hoare logic. XCAP shares the same target machine TM (see Figure 2.1 and 2.2) with CAP.

XCAP provides the first simple and general solution to the ECP problem for Hoare-logic verification systems. Here, by “simple,” we mean that our method does not alter the structure of Hoare-style program derivations and assertions. By “general”, we mean that our technique can truly handle all kinds of machine-level ECPs, including those hidden inside higher-order closures (together with next chapter).

### 3.1 Embedded Code Pointers

ECPs, based on context and time, are often referred to as computed-gotos, stored procedures, higher-order functions, indirect jumps, continuation pointers, and so on. As the variations of its name suggest, ECP has long been an extensively used concept in programming. Because of the ECP problem, PCC systems based on Hoare logic have to either avoid supporting indirect jumps [44, 64], limit the assertions (for ECPs) to types only [28], sacrifice the modularity by requiring whole-program reasoning [63], or resort to construction of complex semantic models [9, 2]. In Reynolds [54], supporting ECPs is listed as one of the main open problems for separation logic.

At the assembly level (as in TM), ECPs denote those memory addresses (labels) stored in registers or memory cells, pointing to the start of code blocks. A special kind of ECPs is the function return addresses; more general kinds of ECPs can also be found in closures. Supporting ECPs is an essential part of assembly code verification. To understand better the ECP problem, let's take a look at the following example:

```
f1: mov r1, f1    // no assumption
    jd  f2

f2: jmp r1        // r1 stores an ECP with no assumption
```

Here we have defined two assembly code blocks. The first block, labeled  $f_1$ , makes no assumption about the state; it simply moves the code label  $f_1$  into register  $r_1$  and then directly jumps to the other code block labeled  $f_2$ . The  $f_2$  block requires that upon entering, register  $r_1$  must contain a code pointer that has no assumption about the state; it simply makes an indirect jump to this embedded code pointer and continues execution from there. If the execution initially starts from  $f_1$ , the machine will loop forever between the two code blocks and never get stuck. It is important to note that both code blocks are independently written so they are expected to not only just work with each other, but with any other code satisfying their specifications as well.

We introduce a predicate,  $\text{cptr}(f, a)$ , to state that value  $f$  is a *valid* code pointer with precondition  $a$ . Following the notations in Section 2.3, we define the “no assumption”

assertion as

$$a_T \triangleq \lambda S. \text{True}.$$

The preconditions for  $f_1$  and  $f_2$  can be written as

$$a_1 \triangleq a_T$$

and

$$a_2 \triangleq \lambda (H, R). a_T (H, R) \wedge \text{cptr}(R(r_1), a_T).$$

But what should be the definition of  $\text{cptr}(f, a)$ ?

**Semantic approach.** The semantic approach to this problem is to directly internalize the Hoare derivations as part of the assertion language and to define  $\text{cptr}(f, a)$  as valid if there exists a Hoare derivation for the code block at  $f$  with precondition  $a$ . Using the notation from CAP, it can be informally written as:

$$\text{cptr}(f, a) \triangleq \Psi \vdash \{a\} C(f).$$

This is clearly ill-formed since  $\Psi$  is not defined anywhere and it can not be treated as a parameter of the  $\text{cptr}$  predicate—the assertion  $a$ , which is used to form  $\Psi$ , may refer to the  $\text{cptr}$  predicate again.

**Semantic approach: stratification.** To break the circularity, one approach [49, 42] is to internalize Hoare-logic derivations as part of the assertion language and stratify all ECPs (as well as the code heaps and specifications that contain them) so that only the  $\text{cptr}$  definitions (and well-formedness proofs) for highly-ranked code blocks can refer to those of lower-ranked ones. More specifically, a first order code pointer does not specify any ECP in its precondition (its code does not make any indirect jump) so we can define  $\text{cptr}$  over them first; an  $n$ -th order code pointer can only refer to those lower-order ECPs so we can define  $\text{cptr}$  inductively following the same order. The  $\text{cptr}$  predicate definition becomes:

$$\text{cptr}(f, a, k) \triangleq \Psi_{k-1} \vdash \{a\} \mathbb{C}_{k-1}(f).$$

While the typing structures of well-formed instruction sequences would look like below.

$$\frac{\begin{array}{c} \Psi_0 \vdash \{a_0\} \dots ; \text{jd} \dots \\ \vdots \\ \dots \\ \Psi_{n-1} \vdash \{a_{n-1}\} \dots ; \text{jmp} \dots \end{array}}{\Psi_n \vdash \{a_n\} \dots ; \text{jmp} \dots}$$

Stratification works for monomorphic languages with simple procedure parameters [49, 42], however, for machine-level programs where ECPs can appear in any part of the memory, tracking the orders of ECPs is impossible: ECPs can appear on the stack or in another function's closure (often hidden because closures are usually existentially quantified [39]).

**Semantic approach: indexing.** Another approach, by Appel *et al* [9, 2, 56], also introduces an parameter  $k$  to the  $\text{cptr}$  predicate. Instead of letting  $k$  refer to the depth of nesting ECPs as the stratification approach does, the “index”  $k$  now refers to the maximum number of safe future computation steps. Roughly speaking,  $\text{cptr}(f, a, k)$  means that it is “safe” to execute the next (at most)  $k-1$  instructions, starting from the code block at  $f$  with precondition  $a$ .

$$\text{cptr}(f, a, k) \triangleq \forall i < k. \Psi \vdash_i \{a\} \mathbb{C}(1)$$

Note that here  $a$  and  $\Psi$  are **not** defined as state predicates and mapping from code labels to them. Instead, indexing must also be done for assertions and all the Hoare inference rules. For example, the indirect jump rule would have the following shape:

$$\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}). a'(\mathbb{H}, \mathbb{R}) \wedge \text{cptr}(\mathbb{R}(r), a', k-1)}{\Psi \vdash_k \{a\} \text{jmp } r}.$$

Indexed assertions can only describe safety properties of finite steps. To establish safety properties about infinite future executions, one needs to do induction over the index. Because of the pervasive uses of indices everywhere, indexing dramatically alters

the structure of Hoare-logic program derivations and assertions. This makes it hard to use together with other extensions such as separation logic [54] and rely-guarantee-based reasoning [64, 20].

**Syntactic approach.** Rather than using the semantic methods, CAP takes a syntactic approach and is essentially reasoning about the control flow. Validity of ECPs is established in two steps. In the first step, in indirect jump rule `JMP` it only requires that we look up the assertion for the target code label stored in register  $r$  from the code heap specification  $\Psi$ . (The equality of assertions used here is the Coq equality `eq` which is equivalent to the Leibniz' equality.)

$$\frac{a \Rightarrow (\lambda(H, R). a' (H, R) \wedge R(r) \in \text{dom}(\Psi) \wedge \Psi(R(r)) = a')}{\Psi \vdash \{a\} \text{jmp } r}.$$

Then in the top-level `PROG` rule the well-formedness of global code heap is checked  $(\Psi_G \vdash \mathbb{C} : \Psi_G)$  to make sure that every assertion stored in  $\Psi_G$  (which is the union of all local  $\Psi$ ) is indeed a valid precondition for the corresponding code block. The effect of these two steps, combined together, guarantees that the Hoare derivations internalized in the semantic approach is still obtainable in the syntactic approach for the preservation of the whole program. This approach is often used by type systems such as TAL.

But how do we know that such label indeed falls into the domain of  $\Psi$ ? We reason about the control flow. Take the code block of  $f_2$  for example, we need to prove:

$$a_2 \Rightarrow (\lambda(H, R). a' (H, R) \wedge R(r_1) \in \text{dom}(\Psi) \wedge \Psi(R(r_1)) = a')$$

which unfolds into

$$\forall H, R. (a_T (H, R) \wedge \text{cptr}(R(r_1), a_T)) \supset (a' (H, R) \wedge R(r_1) \in \text{dom}(\Psi) \wedge \Psi(R(r_1)) = a').$$

Clearly we should require the following to hold:

$$\forall H, R. \text{cptr}(R(r_1), a_T) \supset (R(r_1) \in \text{dom}(\Psi) \wedge \Psi(R(r_1)) = a').$$

If we let the `cptr` predicate directly refer to  $\Psi$  in its definition, assertion and specification become cyclic definitions since  $\Psi$  consists of a mapping from code labels to assertions (which can contain `cptr`).

Previous CAP implementation [63] transforms the above formula into:

$$\forall \mathbb{H}, \mathbb{R}. \text{cptr}(\mathbb{R}(r_1), a_r) \supset \mathbb{R}(r_1) \in \{ f \mid f \in \text{dom}(\Psi) \wedge \Psi(f) = a_r \}$$

and statically calculates the address set on the right side using the global code heap specification  $\Psi_g$ . It can then define:

$$\text{cptr}(f, a) \triangleq f \in \{ f' \mid f' \in \text{dom}(\Psi_g) \wedge \Psi_g(f') = a \}.$$

This is clearly not satisfactory as the actual definition of  $\text{cptr}(f, a)$  no longer refers to  $a$ ! Instead, it will be in the form of  $f \in \{f_1, \dots, f_n\}$ , which is not modular and very hard to reason about.

Taking the modularity issue more seriously, from the `JMP` rule again, we notice that all ECPs it can jump to are those contained in the current local  $\Psi$  only. Since the CAP language presented in previous section supports separate verification through linking rule `LINK`, when checking each module's code heap, we do not have the global specification  $\Psi_g$  and only have the declared import interface  $\Psi_{in}$ . For our example, checking the code blocks under different organizations of modules would result in different  $a_2$  assertions:

$$\lambda(\mathbb{H}, \mathbb{R}). \mathbb{R}(r_1) \in \{f_1\} \quad \text{when } f_1 \text{ and } f_2 \text{ are in a same module}$$

and there is no other block with precondition  $a_r$  in it;

$$\lambda(\mathbb{H}, \mathbb{R}). \mathbb{R}(r_1) \in \{f_1, f_3\} \quad \text{when } f_1 \text{ and } f_2 \text{ are in a same module}$$

and there is a block  $f_3$  also with precondition  $a_r$  in it;

$$\lambda(\mathbb{H}, \mathbb{R}). \mathbb{R}(r_1) \in \{\} \quad \text{when } f_1 \text{ and } f_2 \text{ are not in a same module}$$

and there is no other block with precondition  $a_r$  in  $f_2$  module;

$$\lambda(\mathbb{H}, \mathbb{R}). \mathbb{R}(r_1) \in \{f_3\} \quad \text{when } f_1 \text{ and } f_2 \text{ are not in a same module}$$

and there is a block  $f_3$  also with precondition  $a_r$  in  $f_2$  module.

Since we usually do not know the target addresses statically, we cannot put all possible indirect code pointers into the import specification  $\Psi_{IN}$ . The syntactic approach used by CAP cannot support general ECPs without resorting to the whole-program analysis. This greatly limits CAP’s modularity and expressive power.

**Challenges.** Given the ECP problem explained so far, we can summarize some important criteria for evaluating its possible solutions:

- Is it expressive? The new system should retain all expressive power from CAP. In particular, we still want to write assertions as general logic predicates. Ideally, there should be a “type-preserving” translation from CAP into the new system.
- Is it easy to specify? The specifications should be self-explanatory and can be used to reason about safety properties directly. There should be no need of translating ECP specifications into less informative forms such as indexed assertions or addresses sets as found in approaches aforementioned.
- Is it modular? The specifications should be independently writable and ECPs can be freely passed across the modular boundaries.
- Is it simple? The approach should better not involve overwhelming efforts in design and implementation when compared to CAP. It should not alter or pollute the basic structure of Hoare-style program derivations and assertions.
- Can it support extensions easily? The approach should work smoothly with common language features and popular extensions to Hoare logic.

## 3.2 Extended Propositions

To avoid the “circular specification” problem in the syntactic approach (to ECP), we break the loop by adding a small amount of syntax to the assertion language, and then split the syntax of assertions from their “meanings” (or validities). The key idea here is to

delay the checking of the well-formedness property of ECPs. Instead of checking them individually at the instruction sequence level using locally available specification  $\Psi_{IN}$ , we collect all these checks into one global condition that would only need to be established with respect to the global code heap specification  $\Psi_G$  when all code are finally linked together before execution.

Basically `cptr` is now merely a syntactic constant and can appear in any assertion at any time in the form of `cptr(f, a)`. Its meaning (or validity) is not revealed during the verification of local modules (*i.e.*, the well-formed instruction sequence rules). An “interpretation” will translate the ECP assertion syntax into its meaning (as a meta proposition) when the global code heap specification  $\Psi_G$  is finally available in the top-level `PROG` rule. The `PROG` rule will then complete the well-formedness check for the whole program. We achieve modularity through this two-stage verification structure.

Note that requiring the global specification  $\Psi_G$  in the `PROG` rule does not break any modularity, since at runtime all code (and their specifications) must be made available before they can be executed. Besides, the top-level rule `PROG` only needs to be validated once for the initial machine configuration.

**Extended propositions.** Figure 3.1 defines *extended logical propositions* ( $PropX$ ).  $PropX$  can be viewed as a lifted version of the meta logic propositions, extended with an additional `cptr` constant: the base case  $\langle p \rangle$  is just the lifted proposition  $p$  (thus  $PropX$  retains the full expressive power of meta logic propositions); `cptr` is the constructor for specifying ECP propositions. To interoperate lifted propositions and ECP propositions, we also lift all the logical connectives and quantifiers.

For the universal and existential quantifications, we use higher-order abstract syntax (HOAS) [52] to represent them. For example,  $\forall x:A.P$  is actually implemented as  $\forall(\lambda x:A.P)$ . The benefit here is that we can utilize the full expressive power of the mechanized meta logic (CiC/Coq) and use a single quantifier to quantify over all possible types ( $A$ ) such as *Prop*, *State*, and even *State*  $\rightarrow$  *Prop*.

|                                               |                                   |
|-----------------------------------------------|-----------------------------------|
| $(PropX) \ P, Q ::= \langle p \rangle$        | <i>lifted meta proposition</i>    |
| $\text{cptr}(f, a)$                           | <i>embedded code pointer</i>      |
| $P \wedge Q$                                  | <i>conjunction</i>                |
| $P \vee Q$                                    | <i>disjunction</i>                |
| $P \rightarrow Q$                             | <i>implication</i>                |
| $\forall x:A. P$                              | <i>universal quantification</i>   |
| $\exists x:A. P$                              | <i>existential quantification</i> |
| $(Assertion) \ a \in State \rightarrow PropX$ |                                   |

Figure 3.1: Extended propositions

$$\begin{aligned}
 \llbracket \langle p \rangle \rrbracket_\Psi &\triangleq p \\
 \llbracket \text{cptr}(f, a) \rrbracket_\Psi &\triangleq f \in \text{dom}(\Psi) \wedge \Psi(f) = a \\
 \llbracket P \wedge Q \rrbracket_\Psi &\triangleq \llbracket P \rrbracket_\Psi \wedge \llbracket Q \rrbracket_\Psi \\
 \llbracket P \vee Q \rrbracket_\Psi &\triangleq \llbracket P \rrbracket_\Psi \vee \llbracket Q \rrbracket_\Psi \\
 \llbracket P \rightarrow Q \rrbracket_\Psi &\triangleq \llbracket P \rrbracket_\Psi \supset \llbracket Q \rrbracket_\Psi \\
 \llbracket \forall x:A. P \rrbracket_\Psi &\triangleq \forall B:A. \llbracket P[B/x] \rrbracket_\Psi \\
 \llbracket \exists x:A. P \rrbracket_\Psi &\triangleq \exists B:A. \llbracket P[B/x] \rrbracket_\Psi
 \end{aligned}$$

Figure 3.2: Interpretations of extended propositions

Extended propositions can be used to construct “extended” predicates using the abstraction facility in the underlying meta logic. For example, the following extended state predicate of type  $State \rightarrow PropX$  says registers  $r_1$  and  $r_2$  store the same value, while  $r_3$  stores a non-NULL value:

$$\lambda(H, R). \langle R(r_1) = R(r_2) \wedge R(r_2) \neq \text{NULL} \rangle.$$

Extended predicates are not limited to be over machine states only. For example, the following extended value predicate resembles the code pointer type found in type systems. (Here,  $a$ , the precondition of the code pointed to by  $f$ , is an extended state predicate.)

$$\text{code } a \triangleq \lambda f. \text{cptr}(f, a)$$

Extended propositions adds a thin layer of syntax over meta propositions. Figure 3.2

presents an **interpretation** of their validity in meta logic. It is defined as a meta function. A lifted proposition  $\langle p \rangle$  is valid if  $p$  is valid in the meta logic. Validity of ECP propositions can only be testified with a code heap specification  $\Psi$  (formally defined in the next section), so we make it a parameter of the interpretation function;  $\Psi$  is typically instantiated by (but not limited to) the global specification  $\Psi_G$ . Interpretation of  $\text{cptr}(f, a)$  tests the equality of  $a$  with  $\Psi(f)$ . Here we use the inductively defined Coq equality  $\text{eq}$  (equivalent to the Leibniz' equality) extended with extensionality of functions (a safe extension commonly used in the Coq [30] community). Note that the use of equality predicate in logic is different from the use of equality function in programming—a programmer must supply the proper equality proofs in order to satisfy the ECP interpretation. Extended logical connectives and quantifiers are interpreted in a straight-forward way. Note that HOAS [52] is used in the interpretation of extended implications.

Interpretation of assertions can be trivially defined as

$$[\![a]\!]_\Psi \triangleq \lambda S. [\![a S]\!]_\Psi$$

which turns “syntactic” program specifications in extended propositions into “real” program invariants in the mechanized meta logic.

### 3.3 The XCAP Framework

In this section, we present a new XCAP framework, which is based on CAP and uses  $Prop_X$  as its assertion language. We refer readers to Section 2.3 for more details on the CAP framework and only highlight the difference between XCAP and CAP in this section.

**Assertion Language.** We present the assertion language of XCAP in Figure 3.3. The only difference between it and Figure 2.4 is that assertions ( $a$ ) are now defined as extended state predicates. We also define code heap specifications ( $\Psi$ ) and the assertion subsumption relation ( $\Rightarrow$ ) accordingly. Note that assertions are first turned into meta logical interpretations before doing subsumptions.

$$\begin{aligned}
(CdHpSpec) \quad \Psi &::= \{f \rightsquigarrow a\}^* \\
( Assertion) \quad a &\in State \rightarrow PropX \\
( AssertImp) \quad a \Rightarrow a' &\triangleq \forall \Psi, S. [[a]]_\Psi S \supset [[a']]_\Psi S \\
( StepImp) \quad a \Rightarrow_C a' &\triangleq \forall \Psi, S. [[a]]_\Psi S \supset [[a']]_\Psi \text{Next}_C(S)
\end{aligned}$$

Figure 3.3: Assertion language of XCAP

**Inference rules.** To reason about TM programs in XCAP, just as we did for CAP in Figure 2.5, we present a similar set of inference rules for well-formed programs, code heaps, and instruction sequences in Figure 3.4. Other than the differences in the assertion language, we only modified the PROG rule and the JMP rule, and added a new ECP rule for introducing new ECP propositions into assertions on the fly. For the unchanged rules, readers can refer to Section 2.3 for explanations.

The change for the PROG rule is minor but important. Here we use the global specification  $\Psi_G$  in the interpretation of assertions which may contain ECP propositions. For state  $S$  to satisfy assertion  $a$ , we require a proof for the meta proposition  $([[a]]_{\Psi_G} S)$ . This is the only place in the XCAP inference rules where validity of assertions (with ECP propositions) needs to be established. Other rules only require subsumption between assertions.

For the JMP rule, instead of looking up the target code blocks' preconditions  $a'$  from the current (local) specification  $\Psi$ , we require the current precondition  $a$  to guarantee that the target code label  $R(r)$  is a valid ECP with  $a'$  as its precondition. Combined with the  $([[a]]_{\Psi_G} S)$  condition established in the PROG rule, we can deduce that  $a'$  is indeed the one specified in  $\Psi_G$ .

If we call the JMP rule “consumer” of ECP propositions, then the ECP rule can be called “producer.” It is essentially a “cast” rule—it allows us to introduce new ECP propositions  $\text{cptr}(f, \Psi(f))$  about any code label  $f$  found in the current code heap specification  $\Psi$  into the new assertion  $a'$ . This rule is often used when we move a constant code label into a register to create an ECP.

$\boxed{\Psi_G \vdash \{a\} \mathbb{P}}$  (Well-formed Program)

$$\frac{\Psi_G \vdash \mathbb{C} : \Psi_G \quad (\llbracket a \rrbracket_{\Psi_G} \mathbb{S}) \quad \Psi_G \vdash \{a\} \mathbb{I}}{\Psi_G \vdash \{a\} (\mathbb{C}, \mathbb{S}, \mathbb{I})} \text{ (PROG)}$$

$\boxed{\Psi_{IN} \vdash \mathbb{C} : \Psi}$  (Well-formed Code Heap)

$$\frac{\Psi_{IN} \vdash \{a_i\} \mathbb{I}_i \quad \forall f_i}{\Psi_{IN} \vdash \{f_1 \rightsquigarrow \mathbb{I}_1, \dots, f_n \rightsquigarrow \mathbb{I}_n\} : \{f_1 \rightsquigarrow a_1, \dots, f_n \rightsquigarrow a_n\}} \text{ (CDHP)}$$

$$\frac{\Psi_{IN1} \vdash \mathbb{C}_1 : \Psi_1 \quad \Psi_{IN2} \vdash \mathbb{C}_2 : \Psi_2 \quad \Psi_{IN1}(f) = \Psi_{IN2}(f) \quad \text{dom}(\mathbb{C}_1) \cap \text{dom}(\mathbb{C}_2) = \emptyset \quad \forall f \in \text{dom}(\Psi_{IN1}) \cap \text{dom}(\Psi_{IN2})}{\Psi_{IN1} \cup \Psi_{IN2} \vdash \mathbb{C}_1 \cup \mathbb{C}_2 : \Psi_1 \cup \Psi_2} \text{ (LINK)}$$

$\boxed{\Psi \vdash \{a\} \mathbb{I}}$  (Well-formed Instruction Sequence)

$$\frac{a \Rightarrow_C a' \quad \Psi \vdash \{a'\} \mathbb{I} \quad c \in \{\text{add, addi, mov, movi, alloc, free, ld, st}\}}{\Psi \vdash \{a\} c; \mathbb{I}} \text{ (SEQ)}$$

$$\frac{a \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi)}{\Psi \vdash \{a\} \text{ jd } f} \text{ (JD)}$$

$$\frac{(\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) \leq i \rangle \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow a' \quad \Psi \vdash \{a'\} \mathbb{I} \quad (\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) > i \rangle \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi)}{\Psi \vdash \{a\} \text{ bgti } r_s, i, f; \mathbb{I}} \text{ (BGTI)}$$

$$\frac{a \Rightarrow (\lambda(\mathbb{H}, \mathbb{R}). a' (\mathbb{H}, \mathbb{R}) \wedge \text{cptr}(\mathbb{R}(r), a'))}{\Psi \vdash \{a\} \text{ jmp } r} \text{ (JMP)}$$

$$\frac{(\lambda \mathbb{S}. \text{cptr}(f, \Psi(f)) \wedge a \mathbb{S}) \Rightarrow a' \quad f \in \text{dom}(\Psi) \quad \Psi \vdash \{a'\} \mathbb{I}}{\Psi \vdash \{a\} \mathbb{I}} \text{ (ECP)}$$

Figure 3.4: Inference rules of XCAP

**Example.** Combining the JMP and ECP rules, ECP knowledge can be built up by one module at the time of ECP creation and be passed around and get used for indirect jumps in other modules. For example, given the following code where we assume register  $r_{30}$  contains the return value and register  $r_{31}$  contains the return code pointer:

```

plus: add r30, r0, r1;    // fun plus (a, b) = a + b
      jmp r31

app2: mov r3, r0;          // fun app2(f, a, b) = f(a, b)
      mov r0, r1;
      mov r1, r2;
      jmp r3
  
```

we can assign them with the following XCAP specifications and make safe (higher-order) function calls such as  $app2(plus,1,2)$ .

$$\begin{aligned}
\{ \text{plus} \rightsquigarrow \lambda(\mathbb{H}, \mathbb{R}). \exists a, b, ret. \\
& \langle \mathbb{R}(r_0) = a \wedge \mathbb{R}(r_1) = b \wedge \mathbb{R}(r_{31}) = ret \rangle \\
& \wedge \text{cptr}(ret, \lambda(\mathbb{H}', \mathbb{R}'). \langle \mathbb{R}'(r_{30}) = a + b \rangle) \} \\
\\
\{ \text{app2} \rightsquigarrow \lambda(\mathbb{H}, \mathbb{R}). \exists f, a, b, ret. \\
& \langle \mathbb{R}(r_1) = a \wedge \mathbb{R}(r_2) = b \wedge \mathbb{R}(r_0) = f \wedge \mathbb{R}(r_{31}) = ret \rangle \\
& \wedge \text{cptr}(f, \lambda(\mathbb{H}', \mathbb{R}'). \exists a', b', ret'. \\
& \quad \langle \mathbb{R}'(r_0) = a' \wedge \mathbb{R}'(r_1) = b' \wedge \mathbb{R}'(r_{31}) = ret' \rangle \\
& \quad \wedge \text{cptr}(ret', \lambda(\mathbb{H}'', \mathbb{R}''). \langle \mathbb{R}''(r_{30}) = a' + b' \rangle) \} \\
& \wedge \text{cptr}(ret, \lambda(\mathbb{H}', \mathbb{R}'). \langle \mathbb{R}'(r_{30}) = a + b \rangle) \}
 \end{aligned}$$

**Soundness.** The soundness of XCAP is proved in the same way as we did for CAP. We give the main lemmas and a proof sketch here. More details can be found in Appendix B.

### Lemma 3.1 (XCAP Progress)

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \longrightarrow \mathbb{P}'$ .

**Proof Sketch:** Suppose  $\mathbb{P} = (\mathbb{C}, \mathbb{S}, \mathbb{I})$ , by inversion we obtain  $\Psi_G \vdash \{a\} \mathbb{I}$ . The proof is by induction over this derivation. ■

### Lemma 3.2 (XCAP Preservation)

If  $\Psi_G \vdash \{a\} \mathbb{P}$  and  $\mathbb{P} \rightarrow \mathbb{P}'$  then there exists an assertion  $a'$  such that  $\Psi_G \vdash \{a'\} \mathbb{P}'$ .

**Proof Sketch:** Suppose  $\mathbb{P} = (\mathbb{C}, \mathbb{S}, \mathbb{I})$ ; by inversion we obtain  $\Psi_G \vdash \mathbb{C} : \Psi_G$ ,  $(\llbracket a \rrbracket_{\Psi_G} \mathbb{S})$ , and  $\Psi_G \vdash \{a\} \mathbb{I}$ . We do induction over derivation  $\Psi_G \vdash \{a\} \mathbb{I}$ . The only interesting cases are the JMP and ECP rules.

For the JMP rule case, let  $\mathbb{S}$  be  $(\mathbb{H}, \mathbb{R})$ . By the implication  $a \Rightarrow (\lambda(\mathbb{H}, \mathbb{R}).\text{cptr}(\mathbb{R}(r), a'))$  and the interpretation of cptr it follows that  $a' = \Psi_G(\mathbb{R}(r))$  and  $\mathbb{R}(r) \in \text{dom}(\Psi_G)$ . Then by the same code heap typing lemma as discussed in Section 2.3, it follows that  $\Psi_G \vdash \{a'\} \mathbb{C}(\mathbb{R}(r))$ . Finally by  $a \Rightarrow a'$  it follows that  $a'(\mathbb{H}, \mathbb{R})$ .

For the ECP case, by the code heap typing lemma and by  $(\lambda \mathbb{S}. \text{cptr}(f, \Psi_G(f)) \wedge a \mathbb{S}) \Rightarrow a'$  it follows that  $\llbracket a' \mathbb{S} \rrbracket_{\Psi_G}$ . Also we have  $\Psi_G \vdash \{a'\} \mathbb{I}$ . Then we use the induction hypothesis to finish the proof.  $\blacksquare$

### Theorem 3.3 (XCAP Soundness)

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then for all natural number  $n$ , there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \rightarrow^n \mathbb{P}'$ .

## 3.4 Discussion

The main idea of XCAP is to support Hoare-style reasoning of ECPs by extending the assertion language with a thin layer of syntax. Next we review XCAP using the criteria given in Section 3.1. The following theorem presents a simple “type-preserving” translation from CAP to XCAP and shows that XCAP is at least as powerful as CAP. To avoid confusion, we use  $\vdash_{CAP}$  and  $\vdash_{XCAP}$  to represent CAP judgments (as defined in Figure 2.4) and XCAP ones (as defined in Figure 3.4). See our implementation for detailed proofs.

### Theorem 3.4 (CAP to XCAP Translation)

We define the lifting of CAP assertions and specifications as:

$$\lceil a \rceil \triangleq \lambda \mathbb{S}. \langle a \mathbb{S} \rangle$$

$$\text{and } \lceil \{f_1 \rightsquigarrow a_1, \dots, f_n \rightsquigarrow a_n\} \rceil \triangleq \{f_1 \rightsquigarrow \lceil a_1 \rceil, \dots, f_n \rightsquigarrow \lceil a_n \rceil\}.$$

1. If  $\Psi_G \vdash_{CAP} \{a\} \mathbb{P}$  then  $\lceil \Psi_G \rceil \vdash_{XCAP} \{\lceil a \rceil\} \mathbb{P}$ ;
2. if  $\Psi_{IN} \vdash_{CAP} C : \Psi$  then  $\lceil \Psi_{IN} \rceil \vdash_{XCAP} \{C\} \lceil \Psi \rceil$ ;
3. if  $\Psi \vdash_{CAP} \{a\} \mathbb{I}$  then  $\lceil \Psi \rceil \vdash_{XCAP} \{\lceil a \rceil\} \mathbb{I}$ .

**Proof Sketch:** (1) and (2) are straight forward and based on (3). For (3), as CAP’s JMP rule only reasons about preconditions in the current  $\Psi$ , we use ECP rule on all pairs of code label and precondition in  $\Psi$  and use XCAP’s JMP rule to finish the proof. ■

The specification written in XCAP assertion language is close to a typical Hoare assertion and thus is easy to write and reason about. From a user perspective, there is no need to worry about the “meaning” of the ECP propositions because they are treated abstractly almost all the time.

XCAP is still lightweight because the lifted propositions  $\langle p \rangle$  and their reasoning are shallowly embedded into the meta logic, which is the same as CAP. The added component of ECP propositions as well as other lifted connectives and quantifiers are simple syntactic constructs and do not involve complex constructions.

As we will show in later chapters, the XCAP framework can be easily extended to support other language features and popular extensions of Hoare logic.



## Chapter 4

# Impredicative Polymorphisms and Recursive Specifications

The XCAP system presented in the previous chapter enjoys great expressive power from its underlying meta logic. As the mini examples shown before, features such as data polymorphism can be easily supported. However, to support modular verification, when composing specifications, it is important to be able to abstract out and quantify over (part of) the specifications themselves. This is especially important for ECPs, since very often the specification for the target code is only partially disclosed to the callers. We extend the XCAP from the previous chapter to support this kind of impredicative polymorphism.

Using the extended XCAP, we solve the ECP problem for separation logic [54] and present a verification of a destructive list-append function listed as an example in [54].

Recursive specifications are very useful in describing complex invariants. Simple recursive data structures such as link-list are already supported by XCAP. However, for the recursive types  $(\mu\alpha.\tau)$  found in type systems, their counterpart in logic, “recursive predicates”, can not be easily defined in XCAP. We follow the extension for impredicative polymorphisms and extended XCAP to support recursive specifications.

Both the extensions cause very little change in the XCAP inference rules and meta theory. Thus they are light-weight and proved sound.

## 4.1 Impredicative Polymorphisms and Recursive Specifications

Impredicative polymorphisms and recursive types can be easily supported in type systems. Take TAL [41] for example, it allows quantifications over value types, which correspond to value predicates in XCAP. Since XCAP predicates are much more flexible than types, we choose to support universal and existential quantifications over arbitrary extended predicates of type  $A \rightarrow \text{Prop}_X$  (where  $A$  does not contain  $\text{Prop}_X$ ). We reuse the quantifiers defined in XCAP in Section 3.3 and write **impredicative extended propositions** as follows:

$$\forall \alpha : A \rightarrow \text{Prop}_X. P \quad \text{and} \quad \exists \alpha : A \rightarrow \text{Prop}_X. P.$$

In the implementation of these impredicative quantifiers, the HOAS technique used for predicative ones no longer works because of the negative-occurrence restriction for inductive definitions. We use the de Bruijn notations [15] to encode the impredicative quantifiers. For more details, see Appendix 8.

The next task is to find a way to establish the **validity of impredicative extended propositions**. One obvious idea is to take the previously defined interpretation function in Figure 3.2 and directly apply it to the impredicative quantification cases as follows:

$$[\![ \forall \alpha : A \rightarrow \text{Prop}_X. P ]\!]_{\Psi} \triangleq \forall a : A \rightarrow \text{Prop}_X. [\![ P[a/\alpha] ]\!]_{\Psi}$$

$$[\![ \exists \alpha : A \rightarrow \text{Prop}_X. P ]\!]_{\Psi} \triangleq \exists a : A \rightarrow \text{Prop}_X. [\![ P[a/\alpha] ]\!]_{\Psi}$$

Unfortunately (but not surprisingly), the recursive call parameter  $P[a/\alpha]$  may be larger than the original ones as  $a$  can bring in unbounded new sub-formulas. The interpretation function no longer terminates, and thus is not definable in the meta logic.

Our solution is to define the interpretation of extended propositions as the set of inductively defined validity rules shown in Figure 4.1. We define an environment of extended propositions as

$$(env) \quad \Gamma := \cdot \mid \Gamma, P$$

$\Gamma \vdash_{\Psi} P$

*(Validity of Extended Propositions)*

*(The following presentation omits the  $\Psi$  in judgment  $\Gamma \vdash_{\Psi} P$ .)*

$$\begin{array}{c}
 \frac{P \in \Gamma}{\Gamma \vdash P} \text{ (ENV)} \quad \frac{p}{\Gamma \vdash \langle p \rangle} \text{ (}\langle \rangle\text{-I)} \quad \frac{\Gamma \vdash \langle p \rangle \quad p \supset (\Gamma \vdash Q)}{\Gamma \vdash Q} \text{ (}\langle \rangle\text{-E)} \\
 \\ 
 \frac{\Psi(f) = a}{\Gamma \vdash \text{cptr}(f, a)} \text{ (CP-I)} \quad \frac{\Gamma \vdash \text{cptr}(f, a) \quad (\Psi(f) = a) \supset (\Gamma \vdash Q)}{\Gamma \vdash Q} \text{ (CP-E)} \\
 \\ 
 \frac{\Gamma \vdash P \quad \Gamma \vdash Q}{\Gamma \vdash P \& Q} \text{ (}\&\text{-I)} \quad \frac{\Gamma \vdash P \& Q}{\Gamma \vdash P} \text{ (}\&\text{-E1)} \quad \frac{\Gamma \vdash P \& Q}{\Gamma \vdash Q} \text{ (}\&\text{-E2)} \\
 \\ 
 \frac{\Gamma \vdash P}{\Gamma \vdash P \vee Q} \text{ (}\vee\text{-I1)} \quad \frac{\Gamma \vdash Q}{\Gamma \vdash P \vee Q} \text{ (}\vee\text{-I2)} \quad \frac{\Gamma \vdash P \vee Q \quad \Gamma, P \vdash R \quad \Gamma, Q \vdash R}{\Gamma \vdash R} \text{ (}\vee\text{-E)} \\
 \\ 
 \frac{\Gamma, P \vdash Q}{\Gamma \vdash P \rightarrow Q} \text{ (}\rightarrow\text{-I)} \quad \frac{\Gamma \vdash P \rightarrow Q \quad \Gamma \vdash P}{\Gamma \vdash Q} \text{ (}\rightarrow\text{-E)} \\
 \\ 
 \frac{\Gamma \vdash P[B/x] \quad \forall B : A}{\Gamma \vdash \forall x : A. P} \text{ (}\forall\text{-I1)} \quad \frac{\Gamma \vdash \forall x : A. P \quad B : A}{\Gamma \vdash P[B/x]} \text{ (}\forall\text{-E1)} \\
 \\ 
 \frac{B : A \quad \Gamma \vdash P[B/x]}{\Gamma \vdash \exists x : A. P} \text{ (}\exists\text{-I1)} \quad \frac{\Gamma \vdash \exists x : A. P \quad \Gamma, P[B/x] \vdash Q \quad \forall B : A}{\Gamma \vdash Q} \text{ (}\exists\text{-E1)} \\
 \\ 
 \frac{\Gamma \vdash P[a/\alpha] \quad \forall a : A \rightarrow \text{Prop} X}{\Gamma \vdash \forall \alpha : A \rightarrow \text{Prop} X. P} \text{ (}\forall\text{-I2)} \quad \frac{a : A \rightarrow \text{Prop} X \quad \Gamma \vdash P[a/\alpha]}{\Gamma \vdash \exists \alpha : A \rightarrow \text{Prop} X. P} \text{ (}\exists\text{-I2)}
 \end{array}$$

Figure 4.1: Validity rules for impredicative extended propositions

The judgment,  $\Gamma \vdash_{\Psi} P$ , means that  $P$  is valid under environment  $\Gamma$  and code heap specification  $\Psi$ . An extended proposition is valid if it is in the environment. Constructors of extended propositions have their introduction and elimination rules. The introduction rules of lifted proposition  $\langle p \rangle$  and ECP proposition  $\text{cptr}(f, a)$  require that  $p$  and  $\Psi(f) = a$  be valid in the meta logic. Their elimination rules allow full meta-implication power in constructing derivations of validity of the new extended propositions. The rules for other constructors are standard and require little explanation.

The **interpretation of extended propositions** can be now be simply defined as their validity under the empty environment.

$$[\![P]\!]_\Psi \triangleq \cdot \vdash_\Psi P$$

Given the above definitions of interpretation and validity, we have proved the following soundness theorem (with respect to CiC/Coq) using the syntactic normalization proof method by Pfenning [51]. For proof details, see Appendix A.

**Theorem 4.1 (Soundness of *PropX* Interpretation)**

1. If  $[\![\langle p \rangle]\!]_\Psi$  then  $p$ ;
2. if  $[\![\text{cptr}(f, a)]]\!]_\Psi$  then  $\Psi(f) = a$ ;
3. if  $[\![P \wedge Q]\!]_\Psi$  then  $[\![P]\!]_\Psi$  and  $[\![Q]\!]_\Psi$ ;
4. if  $[\![P \vee Q]\!]_\Psi$  then either  $[\![P]\!]_\Psi$  or  $[\![Q]\!]_\Psi$ ;
5. if  $[\![P \rightarrow Q]\!]_\Psi$  and  $[\![P]\!]_\Psi$  then  $[\![Q]\!]_\Psi$ ;
6. if  $[\![\forall x:A. P]\!]_\Psi$  and  $B:A$  then  $[\![P[B/x]]]\!]_\Psi$ ;
7. if  $[\![\exists x:A. P]\!]_\Psi$  then there exists  $B:A$  such that  $[\![P[B/x]]]\!]_\Psi$ ;
8. if  $[\![\forall \alpha:A \rightarrow \text{PropX}. P]\!]_\Psi$  and  $a:A \rightarrow \text{PropX}$  then  $[\![P[a/\alpha]]]\!]_\Psi$ ;
9. if  $[\![\exists \alpha:A \rightarrow \text{PropX}. P]\!]_\Psi$  then there exists  $a:A \rightarrow \text{PropX}$  such that  $[\![P[a/\alpha]]]\!]_\Psi$ .

**Corollary 4.2 (Consistency)**  $[\![\langle \text{False} \rangle]\!]_\Psi$  is not provable.

To make impredicative extended propositions easy to use, in the implementation of *PropX* and its interpretation, we define additional concrete syntax and proof tactics to hide the de Bruijn representation and the interpretation detail. A user can mostly manipulate *PropX* objects in the same way as with *Prop* objects in Coq. See Chapter 8 for more details.

**Inference rules and soundness.** The XCAP indirect jump rule `JMP` from the one presented in Figure 3.4 can now be viewed as:

$$\frac{a \Rightarrow (\lambda(\mathbb{H}, \mathbb{R}). \exists a'. (a'(\mathbb{H}, \mathbb{R}) \wedge \text{cptr}(\mathbb{R}(r), a')))}{\Psi \vdash \{a\} \text{jmp } r} \text{ (JMP).}$$

The existential quantification over the assertion  $a'$  (for the target code block) is moved from begining (implicitly) over the whole rule to being after the assertion subsumption ( $\Rightarrow$ ). This change is important to support polymorphic code—the target assertion  $a'$  can now depend on the current assertion  $a$ .

All other inference rules of XCAP remain unchanged. The soundness of XCAP inference rules (Theorem 3.3) and the CAP to XCAP translation (Theorem 3.4) only need trivial modifications in the case of indirect jump. We do not restate them here.

**Example.** With impredicative quantifications, ECP can now be specified and used with great flexibility. For example, the *app2* function in Section 3.3 can now be assigned with the following more general specification. Instead of being restricted to an argument with the “plus” functionality, any functions that take two arguments  $a$  and  $b$  and return a value satisfying (unrestricted) assertion  $a_{ret}(a, b)$  can be passed to *app2*.

$$\begin{aligned} \{ \text{app2} \rightsquigarrow \lambda(\mathbb{H}, \mathbb{R}). \exists f, a, b, \text{ret}, a_{ret}. \\ \langle \mathbb{R}(r_1) = a \wedge \mathbb{R}(r_2) = b \wedge \mathbb{R}(r_0) = f \wedge \mathbb{R}(r_{31}) = \text{ret} \rangle \\ \wedge \text{cptr}(f, \lambda(\mathbb{H}', \mathbb{R}'). \exists a', b', \text{ret}'. \\ \langle \mathbb{R}'(r_0) = a' \wedge \mathbb{R}'(r_1) = b' \wedge \mathbb{R}'(r_{31}) = \text{ret}' \rangle \\ \wedge \text{cptr}(\text{ret}', a_{ret}(a', b')) \\ \wedge \text{cptr}(\text{ret}, a_{ret}(a, b)) \} \end{aligned}$$

**Subtyping on ECP propositions.** The ECP proposition has a very rigid interpretation. To establish the validity of  $\text{cptr}(f, a)$ ,  $\Psi(f)$  must be “equal” to  $a$ . This is simple for the system, but is restrictive in usage and differs from typical type systems where subtyping can be used to relax code types. With the support of impredicative quantifications, instead of directly using  $\text{cptr}$ , we can define a more flexible predicate for ECPs:

$$\text{codeptr}(f, a) \triangleq \exists a'. (\text{cptr}(f, a') \wedge \forall S. a \rightarrow a' S).$$

We can define the following subtyping lemma for ECP predicates.

**Lemma 4.3 (Subtyping of ECP Propositions)**

If  $\llbracket \text{codeptr}(f, a') \rrbracket_{\Psi}$  and  $\llbracket \forall S. a \ S \rightarrow a' \ S \rrbracket_{\Psi}$  then  $\llbracket \text{codeptr}(f, a) \rrbracket_{\Psi}$ .

**Proof:** From  $\llbracket \text{codeptr}(f, a') \rrbracket_{\Psi}$  it follows that

$$\llbracket \exists a''. (\text{cptr}(f, a'') \wedge \forall S. a' \ S \rightarrow a'' \ S) \rrbracket_{\Psi}.$$

By the soundness of interpretation theorem it follows that

$$\exists a''. \llbracket \text{cptr}(f, a'') \rrbracket_{\Psi} \wedge \llbracket \forall S. a' \ S \rightarrow a'' \ S \rrbracket_{\Psi}.$$

Using the  $\forall\text{-I1}$ ,  $\forall\text{-E1}$ ,  $\rightarrow\text{-I}$ , and  $\rightarrow\text{-E}$  rules it follows that

$$\llbracket \forall S. a \ S \rightarrow a'' \ S \rrbracket_{\Psi}.$$

Using the  $\wedge\text{-I}$  and  $\exists\text{-I2}$  rules it follows that

$$\llbracket \exists a''. (\text{cptr}(f, a'') \wedge \forall S. a \ S \rightarrow a'' \ S) \rrbracket_{\Psi}.$$

Which is  $\llbracket \text{codeptr}(f, a) \rrbracket_{\Psi}$ . ■

## 4.2 Solving Reynolds's ECP Problem

Separation logic [54] is a recent Hoare-logic framework designed for reasoning about shared mutable data structures. Reynolds [54] listed supporting ECPs as a major open problem for separation logic. In this section, we show how to solve this problem within the XCAP framework (with impredicative polymorphisms support).

XCAP directly supports separation logic specifications and reasoning by defining their constructs and inference rules in the assertion language and meta logic as macros and lemmas. For example, the following are some separation logic primitives defined over the data heap (assuming  $\uplus$  is the disjoint union):

$$\begin{aligned}
\mathsf{emp} &\triangleq \lambda \mathbb{H}. \langle \mathsf{dom}(\mathbb{H}) = \{\} \rangle \\
\mathsf{1} \mapsto \mathsf{w} &\triangleq \lambda \mathbb{H}. \langle \mathsf{dom}(\mathbb{H}) = \{1\} \wedge \mathbb{H}(1) = \mathsf{w} \rangle \\
\mathsf{1} \mapsto \_ &\triangleq \lambda \mathbb{H}. \langle \exists \mathsf{w}. (\mathsf{1} \mapsto \mathsf{w} \ \mathbb{H}) \rangle \\
\mathsf{a}_1 * \mathsf{a}_2 &\triangleq \lambda \mathbb{H}. \exists \mathbb{H}_1, \mathbb{H}_2. \langle \mathbb{H}_1 \uplus \mathbb{H}_2 = \mathbb{H} \rangle \wedge \mathsf{a}_1 \ \mathbb{H}_1 \wedge \mathsf{a}_2 \ \mathbb{H}_2 \\
\mathsf{1} \mapsto \mathsf{w}_1, \dots, \mathsf{w}_n &\triangleq \mathsf{1} \mapsto \mathsf{w}_1 * \dots * \mathsf{1} + n - 1 \mapsto \mathsf{w}_n
\end{aligned}$$

The frame rule can be defined as lemmas (derived rules) in XCAP:

$$\begin{array}{c}
\frac{a \Rightarrow (a' \circ \mathsf{Next}_C)}{(a * a'') \Rightarrow ((a' * a'') \circ \mathsf{Next}_C)} \text{ (FRAME-INSTR)} \\
\frac{\{f_1 \rightsquigarrow a_1, \dots, f_n \rightsquigarrow a_n\} \vdash \{a\} \mathbb{I}}{\{f_1 \rightsquigarrow a_1 * a', \dots, f_n \rightsquigarrow a_n * a'\} \vdash \{a * a'\} \mathbb{I}} \text{ (FRAME-ISEQ)}
\end{array}$$

ECP formulas can appear freely in these assertions and rules, thus it is very convenient to write specifications and reason about shared mutable data structures and embedded code pointers simultaneously. Note that it is assumed that in the derivations in the FRAME-ISEQ rule, there is no ECP or JMP rules being used.

**Example: destructive list-append function in CPS.** To demonstrate the above point, we verify a destructive version of the list-append example which Reynolds [54] used to define the ECP open problem. Following Reynolds, our destructive list-append function is written in continuation passing style (CPS):

```

append(x, y, rk) =
  if x == NULL then rk(y)
  else let k(z) = ([x+1] := z; rk(x))
       in append([x+1], y, k)

```

Here the *append* function takes three arguments: two lists *x* and *y* and a return continuation *rk*. If *x* is an empty list, it calls *rk* with list *y*. Otherwise, it first creates a new continuation function *k* which takes an (appended) list *z*, makes list *x*'s head node link to *z*, and passes the newly formed list (which is pointed to by *x*) to the return continuation *rk*. Variables *x* and *rk* form the closure environment for continuation function *k*. The *append*

function then recursively calls itself with the tail of list  $x$ , list  $y$ , and the new continuation  $k$ . For node  $x$ ,  $[x]$  is its data and  $[x + 1]$  is the link to the next node.

We do closure conversion and translate *append* into TM assembly code. In the presentation, we often write  $a$  for assertion  $\lambda(\mathbb{H}, \mathbb{R}). \exists x_1 : A_1, \dots, x_n : A_n. a$ , so all free variables in  $a$  are existentially quantified right after the lambda abstraction. Formulas such as  $(a * a' \mathbb{H})$  and  $\mathbb{R}(r) = w$  are also simplified to be written as  $a * a'$  and  $r = w$ .

Predicate  $(\text{list } ls \ 1)$  describes a linked list pointed to by  $1$  where the data cell of each node stores the value in  $ls$  respectively. Here  $ls$  is a mathematical list where  $\text{nil}$ ,  $w :: ls$  and  $ls ++ lt$  stand for the cases of empty list, cons, and append, respectively.

$$\text{list } \text{nil } 1 \triangleq \text{emp} \wedge \langle 1 = \text{NULL} \rangle$$

$$\text{list } (w :: ls) 1 \triangleq \exists 1'. 1 \mapsto w, 1' * \text{list } ls 1'$$



Predicate  $(\text{cont } a_{env} ls f)$  requires  $f$  to point to a continuation code block which expects an environment of type  $a_{env}$  and a list which stores  $ls$ .

$$\text{cont } a_{env} ls f \triangleq \text{codeptr}(f, \lambda S. \exists env, z. \langle r_0 = env \wedge r_1 = z \rangle \wedge a_{env} env * \text{list } ls z)$$



Predicate  $(\text{clos } ls 1)$  describes a continuation closure pointed to by  $1$ ; this closure is a pair  $(cnt, env)$  where  $cnt$  is a continuation function pointer and  $env$  points to an environment for  $cnt$ . The environment predicate  $a_{env}$  is hidden inside the closure predicate.

$$\text{clos } ls 1 \triangleq \exists a_{env}, cnt, env. 1 \mapsto cnt, env * a_{env} env \wedge \text{cont } a_{env} ls cnt$$



In Figure 4.2 and Figure 4.3, we list the precondition for each instruction on its right side. The instruction determines which well-formed instruction rule to use at each step. State diagrams are drawn before all the interesting steps.



Figure 4.2: Code, specification, and illustration of the list append function



Figure 4.3: Code, specification, and illustration of the list append function (continued)

Our specification of the *append* function guarantees that the return continuation *rk* will get a **correctly** appended list (*i.e.*, the list contents are exactly the same as the two input lists). Furthermore, even though the function contains a large amount of heap allocation, mutation, and deallocation (which are used to build and destroy continuation closures and to append two lists on the fly), memory safety is **fully** guaranteed (no garbage, no illegal free operation).

### 4.3 Recursive Specifications

Recursive specifications are very useful in describing complex invariants. Simple recursive data structures such as link-list are already supported by the inductive definition found in *Prop*. However, for the recursive types  $(\mu\alpha.\tau)$  found in TAL, where things such as embedded code pointers can be nested inside an inductive definition, their counterpart in logic, “recursive predicates”, can not be easily defined in XCAP. We extend *PropX* to support recursive predicates and use syntactic methods to establish validity.

We define the recursive predicate constructor as follows.

$$(PropX) \quad P, Q \quad ::= \quad \dots \mid (\mu\alpha:A \rightarrow PropX. \lambda x:A. P \ B)$$

To understand the formation of recursive predicate, we will start from the innermost proposition  $P$ .  $\lambda x:A. P$  is a predicate of type  $A \rightarrow PropX$ . So  $\mu\alpha:A \rightarrow PropX. \lambda x:A. P$  is meant to be a recursive predicate of type  $A \rightarrow PropX$ , which corresponds to recursive types found in type systems. Since the basic unit of definition is extended proposition instead of extended predicate, we apply it with a term  $B$  of type  $A$ , and make  $(\mu\alpha:A \rightarrow PropX. \lambda x:A. P \ B)$  the basic shape of recursive predicates formula. When using recursive predicates formulas, we often use its predicate form, and use the notation  $\mu\alpha:A \rightarrow PropX. \lambda x:A. P$  to represent predicate  $\lambda y:A. (\mu\alpha:A \rightarrow PropX. \lambda x:A. P \ y)$ .

To establish validity of recursive predicate formulas, we add the following rule to the validity rules in Figure 4.1. It is essentially a fold rule for recursive types.

$$\frac{B:A \quad \Gamma \vdash P[B/x][\mu\alpha:A \rightarrow \text{Prop}X.\lambda x:A.P/\alpha]}{\Gamma \vdash (\mu\alpha:A \rightarrow \text{Prop}X.\lambda x:A.P) B} (\mu^{-1})$$

We extend  $\text{Prop}X$  interpretation soundness (Theorem 4.1) with the following case:

If  $\llbracket (\mu\alpha:A \rightarrow \text{Prop}X.\lambda x:A.P) B \rrbracket_\Psi$  then  $\llbracket P[B/x][\mu\alpha:A \rightarrow \text{Prop}X.\lambda x:A.P/\alpha] \rrbracket_\Psi$ .

## 4.4 Discussion

In Figure 4.1 we have only included the introduction rules for the two impredicative quantifiers. This could cause confusion because from the logic perspective, missing the two elimination rules would raise questions related to the completeness of the logic. However, despite its name,  $\text{Prop}X$  is **not** designed to be a general (complete) logic; it is purely a level of syntax laid upon the meta logic. While its expressive power comes from the lifted propositions  $\langle p \rangle$ , the modular handling of ECPs and impredicative polymorphism follows syntactic types.

To certify the examples in this dissertation (or any polymorphic TAL programs), what we need is to establish the assertion subsumption relation  $\Rightarrow$  between XCAP assertions. According to its definition, assertion subsumption is merely a meta-implication between validities of XCAP propositions. Although in certain cases it is possible to first do all the subsumption reasoning in  $\text{Prop}X$  and prove  $\llbracket P \rightarrow Q \rrbracket_\Psi$ , and then obtain the subsumption proof  $\llbracket P \rrbracket_\Psi \supset \llbracket Q \rrbracket_\Psi$  by Theorem 4.1, it is not always possible due to the lack of completeness for  $\text{Prop}X$ , and is not the way  $\text{Prop}X$  should be used. Instead, one can always follow the diagram below in proving subsumption relations (we use the impredicative existential quantifier as an example):



To prove the intended “implication” relation (the top one), we first use Theorem 4.1 to turn the source proposition’s existential quantification into the meta one, from which we can do (flexible) meta implications. Then we reconstruct the existential quantification of the target proposition via the introduction rule. This way, the construction of subsumption proof in meta logic does not require the reasoning at the *PropX* level.

In fact, the subtyping relation found in TAL can be simulated by the subsumption relation in XCAP (with only the introduction rules for the two impredicative quantifiers). What the missing “elimination rules” would add is the ability to support a notion of “higher-order subtyping” between “impredicative types”, which does not appear in practical type systems such as TAL, FLINT, or ML. Although it could be nice to include such a feature in XCAP, we did not do so since that would require a complex semantic normalization proof instead of the simple syntactic one used for Theorem 4.1.

As we will show in the next Chapter, this is enough for reasoning about impredicative polymorphism available in typical type systems. Translations from polymorphic type systems such as TAL to XCAP also do not require the elimination rules.

Similarly explanation applies to recursive specifications, for which only the introduction rules are defined, too.



## Chapter 5

# Weak Updates and a Translation from Typed Assembly Language

Weak update, also termed as “general reference”, is another higher-order features that logic-based verification methods failed to support well. In this chapter we first show how to extend the XCAP framework to support weak update, using similar syntactic technique for the support of ECP in Chapter 3.

We then explore the relationship between XCAP and typed assembly languages (TAL). TAL and CAP/XCAP are suitable for different kinds of verification tasks. Previously, programs verified in either one of them can not interoperate freely with the other, making it hard to integrate them into a complete system. Moreover, the relationship between TAL and CAP lines of work has not been discussed extensively.

In this chapter, we compare the type-based and logic-based methods by presenting a type-preserving translation from a TAL language to XCAP. The translation involves an intermediate step of a “semantic” TAL language. Our translation supports polymorphic code, mutable reference, existential, and recursive types. Since we proved typing preservation for the translation from TAL to XCAP, there is a clear path to link and interoperate well-typed programs from traditional certifying compilers with certified libraries by CAP-like systems.

## 5.1 Weak Update in the Logical Setting

Weak update (also termed as “mutable reference” or “general references”) is a commonly used memory mutation model. Examples of weak update include ML reference cells (`int ref`) and managed data pointers (`int __gc*`) in .NET common type system. In the weak update model, the value of each memory cell must satisfy a certain fixed value type. The tuple type in TAL (such as the one to be shown in next section) is also a weak update reference cell types.

Unfortunately, weak update is not well-supported by CAP and other Hoare-logic-based verification systems. The problem is due to the lack of a global data heap invariant that local assertions about heap cells can be checked upon. Existing Hoare-logic-based systems either avoid supporting weak update [44, 64], limit the assertions (for reference cells) to types only [28], or resort to heavyweight techniques that require the construction of complex semantic models [9, 2]. In this section, we present a weak update extension of the XCAP using the similar syntactic technique for ECPs.

Following `cptr` for ECP, We add a reference cell proposition  $\text{ref}(1, t)$  to the extended propositions. It associates word type  $t$  (a value predicate) with data label 1.

$$\begin{aligned} (\text{Prop}X) \quad P, Q &::= \dots \mid \text{ref}(1, t) \\ (\text{Word}Ty) \quad t &\in \text{Word} \rightarrow \text{Prop}X \end{aligned}$$

We can use the following macro to describe a record of  $n$  cells.

$$\text{record}(1, t_1, \dots, t_n) \triangleq \text{ref}(1, t_1) \wedge \dots \wedge \text{ref}(1+n-1, t_n)$$

To testify the validity of reference cell propositions, in the interpretation  $[\![P]\!]_{\Psi, \Phi}$  we need an additional “data heap specification” parameter  $\Phi$  which, similar to the code heap specification  $\Psi$ , is a partial mapping from data labels to word types.

$$(\text{DtHpSpec}) \quad \Phi ::= \{1 \rightsquigarrow t\}^*$$

To establish validity of  $\text{ref}(1, t)$ , data label 1 and word type  $t$  need to be in  $\Phi$ .

$$\frac{\Phi(1) = t}{\Gamma \vdash_{\Psi, \Phi} \text{ref}(1, t)} \text{ (RF-I)} \quad \frac{\Gamma \vdash_{\Psi, \Phi} \text{ref}(1, t) \quad (\Phi(1) = t) \supset (\Gamma \vdash Q)}{\Gamma \vdash_{\Psi, \Phi} Q} \text{ (RF-E)}$$

Validity rules of other cases remain unchanged other than taking an extra  $\Phi$  argument.

Validity soundness of those cases also holds, with the following additional case.

If  $\llbracket \text{ref}(1, t) \rrbracket_{\Psi, \Phi}$  then  $\Phi(1) = t$ ;

Different from code heap, the data heap is dynamic. Its specification can not be obtained statically. Instead, we need to find it out in each execution steps. So the assertion interpretation is changed to:

$$\llbracket a \rrbracket_{\Psi} \triangleq \lambda(\mathbb{H}, \mathbb{R}). \exists \Phi, \mathbb{H}_s, \mathbb{H}_w. \mathbb{H} = \mathbb{H}_s \uplus \mathbb{H}_w \wedge \llbracket a (\mathbb{H}_s, \mathbb{R}) \rrbracket_{\Psi, \Phi} \wedge \mathcal{D}\mathcal{H} \Psi \Phi \mathbb{H}_w$$

There should exist a data heap specification  $\Phi$  describing the weak update part of current data heap. Other than checking validity of  $(a (\mathbb{H}, \mathbb{R}))$  using  $\Psi$  and  $\Phi$ , we also need to checking validity of  $\Phi$ .

For a data heap specification  $\Phi$  to be valid, each reference cell must contain a value that matches its word type.

$$\mathcal{D}\mathcal{H} \Psi \Phi \mathbb{H} \triangleq \forall 1 \in \text{dom}(\Phi) = \text{dom}(\mathbb{H}). \llbracket \Phi(1) \mathbb{H}(1) \rrbracket_{\Psi, \Phi}$$

The extension of XCAP to support weak update requires minor changes to the assertion languages and interpretations, and zero change to the inference rules. Thus the soundness of XCAP is easily preserved.

Based on the weak update memory model defined above, we can derive many useful “macro” inference rules as shown below. These rules can help guide the proof process for the programmer. ( $\text{insens}(a, r)$  asserts predicate is insensitive to register  $r$ , *i.e.*, does not talk about register  $r$ . Its definition is omitted here.)

$$\frac{\Psi \vdash \{(\lambda(\mathbb{H}, \mathbb{R}). a(\mathbb{H}, \mathbb{R}) \wedge t \mathbb{R}(r_d)))\} \mathbb{I} \quad \text{insens}(a, r_d)}{\Psi \vdash \{(\lambda(\mathbb{H}, \mathbb{R}). a(\mathbb{H}, \mathbb{R}) \wedge \text{ref}(\mathbb{R}(r_s) + w, t))\} \text{ld } r_d, r_s(w); \mathbb{I}} \text{ (W-LD)}$$

$$\frac{\Psi \vdash \{a\} \mathbb{I}}{\Psi \vdash \{(\lambda(\mathbb{H}, \mathbb{R}). a(\mathbb{H}, \mathbb{R}) \wedge \text{ref}(\mathbb{R}(r_d) + w, t) \wedge t \mathbb{R}(r_s)))\} \text{st } r_d(w), r_s; \mathbb{I}} \text{ (W-ST)}$$

**Example.** Below is a “mini object”, using the recursive predicates and weak update extensions presented in previous sections.

```
classs c {
  void f (c x) { x.f(x) }
}

c  $\triangleq$   $\mu \alpha:Word \rightarrow PropX. \lambda x:Word. ref(x, \lambda y:Word. cptr(y, \lambda(\mathbb{H}, \mathbb{R}).(\alpha \mathbb{R}(r1))))$ 
```

The above example may look similar to what one would normally write in a syntactic type system. However, given the ability to write general logic predicate in the specifications, it is possible to compose interesting data structures and properties. Below is an example: a record pointer  $l$  which points to an even number, a data pointer to a reference cell storing an odd number, and a code pointer which, among other things, expects register  $r_1$  to be an unaliased pointer to a prime number. (here we used the separation logic primitives embedded in Coq, as discussed in Section 4.2).

```
record(l, even,  $\lambda w. ref(w, \text{odd})$ ,  $\lambda w. cptr(w, \lambda(\mathbb{H}, \mathbb{R}).(\exists w. \mathbb{R}(r_1) \mapsto w \wedge \text{prime } w) * \dots)$ )
```

## 5.2 Typed Assembly Language (TAL)

The TAL language presented here follows the principle of the original typed assembly languages [41]. However, due to the usage of TM, a untyped raw machine, the shape of typing judgments and rules are slightly different. To simplify our presentation, the TAL in this chapter does not directly deal with heap allocation/deallocation.

**Type definitions.** Figure 5.1 presents the type definitions in TAL. Machine word is classified as of value type ( $\tau$ ) including integer, code, tuple, existential package, and recursive data structures. A code heap specification ( $\Psi$ ) is a partial environment that maps a code label to a “precondition” type ( $[\Delta].\Gamma$ ) for its corresponding code block. Here  $\Delta$  is a type variable environment and  $\Gamma$  is a register file type which specifies the type for each register. Similarly, a data heap specification ( $\Phi$ ) is a partial environment that maps from a data label to a value type for its corresponding heap cell.

$$\begin{aligned}
(CdHpSpec) \quad \Psi &::= \{f \rightsquigarrow [\Delta].\Gamma\}^* \\
(RfileTy) \quad \Gamma &::= \{r \rightsquigarrow \tau\}^* \\
(TyVarEnv) \quad \Delta &::= \cdot \mid \alpha, \Delta \\
(WordTy) \quad \tau &::= \alpha \mid \text{int} \mid \text{code } [\Delta].\Gamma \mid \langle \tau_1, \dots, \tau_n \rangle \mid \exists \alpha. \tau \mid \mu \alpha. \tau \\
(DtHpSpec) \quad \Phi &::= \{l \rightsquigarrow \tau\}^*
\end{aligned}$$

Figure 5.1: Type definitions of TAL

**Static semantics.** The top-level semantic rules of TAL are presented in Figure 5.2. A program is well-formed if each of its components is. For a code heap to be well-formed, each block in it must be well-formed. The intuition behind well-formed instruction sequence judgment is that if the state satisfies the precondition  $[\Delta].\Gamma$ , then executing  $\mathbb{I}$  is safe with respect to  $\Psi$ . Weakening is allowed to turn one precondition into another provided that they satisfy the subtyping relation. A instruction sequence  $c;\mathbb{I}$  is safe if one can find another register file type which serves as both the post-condition of  $c$  and the precondition of  $\mathbb{I}$ . A direct jump is safe if the current precondition implies the precondition of the target code block specified in  $\Psi$ . An indirect jump is safe when the target register is of a code type with a weaker precondition. Constant code labels can be moved into registers.

The subtyping and instruction typing rules for TAL is presented in Figure 5.3. Valid subtypings include dropping registers, instantiation of code type, packing and unpacking of existential packages, and folding and unfolding of recursive types. It allows jumping to code blocks with stronger preconditions than required.

For each non-control-flow-transfer instruction there is a pre/post-condition relation defined. When a register is updated by an instruction, the register file type is also updated by a new value type. Arithmetic instruction only operates on two registers with integer types. Since there is no static data heap, a constant value can only be well-formed under an empty Heap type before it can be moved to a register. For simple instructions, their pre- and post-conditions do not involve a change of type variable environment, thus we only specify the register file types before and after their execution. The instruction-level

$\boxed{\Psi_G \vdash \{[\Delta].\Gamma\} \mathbb{P}}$  **(Well-formed Program)**

$$\frac{\Psi_G \vdash \mathbb{C} : \Psi_G \quad \Psi_G \vdash \mathbb{S} : [\Delta].\Gamma \quad \Psi_G \vdash \{[\Delta].\Gamma\} \mathbb{I}}{\Psi_G \vdash \{[\Delta].\Gamma\} (\mathbb{C}, \mathbb{S}, \mathbb{I})} \text{ (PROG)}$$

$\boxed{\Psi_{IN} \vdash \mathbb{C} : \Psi}$  **(Well-formed Code Heap)**

$$\frac{\Psi_{IN} \vdash \{\Psi(f)\} \mathbb{C}(f) \quad \forall f \in \text{dom}(\Psi)}{\Psi_{IN} \vdash \mathbb{C} : \Psi} \text{ (CDHP)}$$

$$\frac{\Psi_{IN1} \vdash \mathbb{C}_1 : \Psi_1 \quad \Psi_{IN2} \vdash \mathbb{C}_2 : \Psi_2 \quad \Psi_{IN1}(f) = \Psi_{IN2}(f) \quad \text{dom}(\mathbb{C}_1) \cap \text{dom}(\mathbb{C}_2) = \emptyset \quad \forall f \in \text{dom}(\Psi_{IN1}) \cap \text{dom}(\Psi_{IN2})}{\Psi_{IN1} \cup \Psi_{IN2} \vdash \mathbb{C}_1 \cup \mathbb{C}_2 : \Psi_1 \cup \Psi_2} \text{ (LINK)}$$

$\boxed{\Psi \vdash \{[\Delta].\Gamma\} \mathbb{I}}$  **(Well-formed Instruction Sequence)**

$$\frac{\vdash \{\Gamma\} c \{ \Gamma' \} \quad \Psi \vdash \{[\Delta].\Gamma'\} \mathbb{I} \quad c \in \{\text{add, addi, mov, movi, ld, st}\}}{\Psi \vdash \{[\Delta].\Gamma\} c; \mathbb{I}} \text{ (SEQ)}$$

$$\frac{f \in \text{dom}(\Psi) \quad \vdash [\Delta].\Gamma \leq \Psi(f)}{\Psi \vdash \{[\Delta].\Gamma\} \text{ jd } f} \text{ (JD)}$$

$$\frac{\Gamma(r) = \text{code } [\Delta'].\Gamma' \quad \vdash [\Delta].\Gamma \leq [\Delta'].\Gamma'}{\Psi \vdash \{[\Delta].\Gamma\} \text{ jmp } r} \text{ (JMP)}$$

$$\frac{f \in \text{dom}(\Psi) \quad \Gamma(r_s) = \text{int} \quad \Psi \vdash \{[\Delta].\Gamma\} \mathbb{I} \quad \vdash [\Delta].\Gamma \leq \Psi(f)}{\Psi \vdash \{[\Delta].\Gamma\} \text{ bgti } r_s, i, f; \mathbb{I}} \text{ (BGTI)}$$

$$\frac{f \in \text{dom}(\Psi) \quad \Psi \vdash \{[\Delta].\Gamma \{ r_d \sim \text{code } \Psi(f) \} \} \mathbb{I}}{\Psi \vdash \{[\Delta].\Gamma\} \text{ movi } r_d, f; \mathbb{I}} \text{ (MOVF)}$$

$$\frac{\vdash [\Delta].\Gamma \leq [\Delta'].\Gamma' \quad \Psi \vdash \{[\Delta'].\Gamma'\} \mathbb{I}}{\Psi \vdash \{[\Delta].\Gamma\} \mathbb{I}} \text{ (WEAKEN)}$$

Figure 5.2: Top-level static semantics of TAL

$\vdash [\Delta].\Gamma \leq [\Delta'].\Gamma'$

(*Subtyping*)

$$\frac{\Delta \supseteq \Delta' \quad \forall \mathbf{r} \in \text{dom}(\Gamma') \quad \Gamma(\mathbf{r}) = \Gamma'(\mathbf{r}) \quad \Delta' \vdash \Gamma'(\mathbf{r})}{\vdash [\Delta].\Gamma \leq [\Delta'].\Gamma'} \text{ (SUBT)}$$

$$\frac{\Gamma(\mathbf{r}) = \text{code } [\alpha, \Delta'].\Gamma' \quad \Delta \vdash \tau'}{\vdash [\Delta].\Gamma \leq [\Delta].\Gamma\{\mathbf{r} : \text{code } [\Delta'].\Gamma'[\tau'/\alpha]\}} \text{ (TAPP)}$$

$$\frac{\Gamma(\mathbf{r}) = \tau[\tau'/\alpha] \quad \Delta \vdash \tau'}{\vdash [\Delta].\Gamma \leq [\Delta].\Gamma\{\mathbf{r} : \exists \alpha. \tau\}} \text{ (PACK)}$$

$$\frac{\Gamma(\mathbf{r}) = \exists \alpha. \tau}{\vdash [\Delta].\Gamma \leq [\alpha, \Delta].\Gamma\{\mathbf{r} : \tau\}} \text{ (UNPACK)}$$

$$\frac{\Gamma(\mathbf{r}) = \tau[\mu \alpha. \tau / \alpha]}{\vdash [\Delta].\Gamma \leq [\Delta].\Gamma\{\mathbf{r} : \mu \alpha. \tau\}} \text{ (FOLD)}$$

$$\frac{\Gamma(\mathbf{r}) = \mu \alpha. \tau}{\vdash [\Delta].\Gamma \leq [\Delta].\Gamma\{\mathbf{r} : \tau[\mu \alpha. \tau / \alpha]\}} \text{ (UNFOLD)}$$

$\vdash \{\Gamma\} \subset \{\Gamma'\}$

(*Well-formed Instruction*)

$$\frac{\Gamma(\mathbf{r}_s) = \Gamma(\mathbf{r}_t) = \text{int}}{\vdash \{\Gamma\} \text{ add } \mathbf{r}_d, \mathbf{r}_s \mathbf{r}_t \{\Gamma\{\mathbf{r}_d : \text{int}\}\}} \text{ (ADD)}$$

$$\frac{\Gamma(\mathbf{r}_s) = \text{int}}{\vdash \{\Gamma\} \text{ addi } \mathbf{r}_d, \mathbf{r}_s, i \{\Gamma\{\mathbf{r}_d : \text{int}\}\}} \text{ (ADDI)}$$

$$\frac{\Gamma(\mathbf{r}_s) = \tau}{\vdash \{\Gamma\} \text{ mov } \mathbf{r}_d, \mathbf{r}_s \{\Gamma\{\mathbf{r}_d : \tau\}\}} \text{ (MOV)}$$

$$\frac{}{\vdash \{\Gamma\} \text{ movi } \mathbf{r}_d, \mathbf{w} \{\Gamma\{\mathbf{r}_d : \text{int}\}\}} \text{ (MOVI)}$$

$$\frac{\Gamma(\mathbf{r}_s) = \langle \tau_1, \dots, \tau_{w+1}, \dots, \tau_n \rangle}{\vdash \{\Gamma\} \text{ ld } \mathbf{r}_d, \mathbf{r}_s(\mathbf{w}) \{\Gamma\{\mathbf{r}_d : \tau_{w+1}\}\}} \text{ (LD)}$$

$$\frac{\Gamma(\mathbf{r}_d) = \langle \tau_1, \dots, \tau_{w+1}, \dots, \tau_n \rangle \quad \Gamma(\mathbf{r}_s) = \tau_{w+1}}{\vdash \{\Gamma\} \text{ st } \mathbf{r}_d(\mathbf{w}), \mathbf{r}_s \{\Gamma\}} \text{ (ST)}$$

Figure 5.3: Static semantics of TAL (subtyping and instruction typing)

typing rules is not very flexible and expressive, *e.g.*, pointer arithmetic is not possible.

The typing rules for value types, machine state, register file, data heap, and machine word values are presented in Figure 5.4. A value type is well-formed only when it contains no free type variable and, thus, is a ground type. The well-formed state rule instantiates the current type variable environment and requires a current data heap specification to be supplied and checked. State typing is done by checking the Heap and register file’s well-formednesses separately under this heap specification. Heap and register file typing further break heap and register file into single word values, and check their well-formedness individually. Any machine word value can be typed as an integer. A label can be typed as a tuple pointer if the cell types in the heap type starting from the label match the corresponding value types in the tuple type. For a label to be considered a code pointer, the code precondition in it has to match the precondition listed in the global context.

**Soundness.** The soundness theorem guarantees that given a well-formed program, the machine will never get stuck. It is proved following the syntactic approach of proving type soundness [59]. We also list a few key lemmas below.

#### **Theorem 5.1 (TAL Soundness)**

If  $\Psi \vdash \{[\Delta].\Gamma\} \mathbb{P}$ , for all natural number  $n$ , there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \xrightarrow{n} \mathbb{P}'$ .

#### **Lemma 5.2 (TAL State Weakening)**

If  $\Psi \vdash \mathbb{S}:[\Delta].\Gamma$  and  $\vdash [\Delta].\Gamma \leq [\Delta'].\Gamma'$  then  $\Psi \vdash \mathbb{S}:[\Delta'].\Gamma'$ .

#### **Lemma 5.3 (TAL Instruction Typing)**

If  $\Psi \vdash \mathbb{S}:[\Delta].\Gamma$  and  $\vdash \{\Gamma\} c \{\Gamma'\}$  then  $\Psi \vdash \text{Next}_C(\mathbb{S}) : [\Delta].\Gamma'$ .

### 5.3 A “Semantic” TAL Language

Instead of doing translation from TAL to XCAP directly, we create a new “Semantic” TAL language (STAL) to serve as an intermediate step between them. Let us revisit the static

$$\boxed{\Delta \vdash \tau \quad \Psi \vdash \mathbb{S} : [\Delta].\Gamma \quad \Psi \vdash \mathbb{H} : \Phi \quad \Psi ; \Phi \vdash \mathbb{R} : \Gamma}$$

(Well-formed Type, State, Heap, and Register file)

$$\begin{array}{c}
 \frac{FTV(\tau) \subseteq \Delta}{\Delta \vdash \tau} \text{ (TYPE)} \\
 \\
 \frac{\cdot \vdash \tau_i \quad \forall i \quad \Psi \vdash \mathbb{H} : \Phi \quad \Psi ; \Phi \vdash \mathbb{R} : \Gamma[\tau_1, \dots, \tau_n / \alpha_1, \dots, \alpha_n]}{\Psi \vdash (\mathbb{H}, \mathbb{R}) : [\alpha_1, \dots, \alpha_n].\Gamma} \text{ (STATE)} \\
 \\
 \frac{\Psi ; \Phi \vdash \mathbb{H}(1) : \Phi(1) \quad \forall 1 \in \text{dom}(\Phi) = \text{dom}(\mathbb{H})}{\Psi \vdash \mathbb{H} : \Phi} \text{ (HEAP)} \\
 \\
 \frac{\Psi ; \Phi \vdash \mathbb{R}(r) : \Gamma(r) \quad \forall r \in \text{dom}(\Gamma)}{\Psi ; \Phi \vdash \mathbb{R} : \Gamma} \text{ (RFILE)}
 \end{array}$$

$$\boxed{\Psi ; \Phi \vdash w : \tau} \quad (\text{Well-formed Word Value})$$

$$\begin{array}{c}
 \frac{}{\Psi ; \Phi \vdash w : \text{int}} \text{ (INT)} \\
 \\
 \frac{f \in \text{dom}(\Psi)}{\Psi ; \Phi \vdash f : \text{code } \Psi(f)} \text{ (CODE)} \\
 \\
 \frac{\cdot \vdash \tau' \quad \Psi ; \Phi \vdash f : \text{code } [\alpha, \Delta].\Gamma}{\Psi ; \Phi \vdash f : \text{code } [\Delta].\Gamma[\tau' / \alpha]} \text{ (POLY)} \\
 \\
 \frac{\Phi(1+i-1) = \tau_i \quad \forall i}{\Psi ; \Phi \vdash 1 : \langle \tau_1, \dots, \tau_n \rangle} \text{ (TUP)} \\
 \\
 \frac{\cdot \vdash \tau' \quad \Psi ; \Phi \vdash w : \tau[\tau' / \alpha]}{\Psi ; \Phi \vdash w : \exists \alpha. \tau} \text{ (EXT)} \\
 \\
 \frac{\Psi ; \Phi \vdash w : \tau[\mu \alpha. \tau / \alpha]}{\Psi ; \Phi \vdash w : \mu \alpha. \tau} \text{ (REC)}
 \end{array}$$

Figure 5.4: Static semantics of TAL (state and value typing)

semantics of TAL in Figure 5.3 and the lemmas used in TAL soundness proof.

First, look at the set of subtyping rules between preconditions ( $\vdash [\Delta].\Gamma \leq [\Delta'].\Gamma'$ ). If we also look at the State Weakening lemma (Lemma 5.2) in TAL soundness proof, it is easy to see that all of those syntactic rules are just used for the meta implication between the two state typings in the soundness proof. Given a mechanized meta logic, we can replace these rules with a single one,

$$\frac{\Psi \vdash \mathbb{S}:[\Delta].\Gamma \supset \Psi \vdash \mathbb{S}:[\Delta'].\Gamma' \quad \forall \mathbb{S}, \Psi}{\vdash [\Delta].\Gamma \leq [\Delta'].\Gamma'} \text{ (SUBT)}$$

which explicitly requests the meta implications between two state typing. By doing so, not only did we remove the fixed syntactic subtyping rules from TAL, but the State Weakening lemma is no longer part of the soundness proof. More importantly, the subtyping between preconditions is no longer limited to the built-in rules. Any valid implication relation between state typing is allowed. This is much more flexible and powerful.

Now let us look at the set of instruction typing rules ( $\vdash \{\Gamma\}c\{\Gamma'\}$ ). Also look at the Instruction Typing lemma (Lemma 5.3) in TAL soundness proof. Again all these syntactic rules are just used for the meta implication between two state typings in the soundness proof. (The difference this time is that the two states are now different and are states before and after the execution of an instruction, respectively.) Applying our trick again, we can replace these rules with a single one,

$$\frac{\Psi \vdash \mathbb{S}:[\Delta].\Gamma \supset \Psi \vdash \text{Next}_C(\mathbb{S}):[\Delta].\Gamma' \quad \forall \mathbb{S}, \Psi}{\vdash \{\Gamma\}c\{\Gamma'\}} \text{ (INSTR).}$$

We also successfully removed the fixed syntactic instruction typing rules as well as the Instruction Typing lemma from TAL. The new form of instruction typing is much more flexible and powerful.

Of course, in composing the actual meta proof supplied to the new SUBT and INSTR rules, it is most likely that those disappeared lemmas will still be used. Nevertheless, making them separate from the type language and its meta theory is important because it reduces the size of the type language, while allowing more flexible reasoning. By intro-

ducing meta implication into TAL, we made one step forward so now there is a mixture of syntactic types and logic proofs.

We call this version of TAL a *semantic* TAL (STAL). STAL and TAL share the exactly same syntax and top-level static semantics (Figure 5.2), as well as the same state and value typing rules (Figure 5.4). While STAL has much more reasoning power than TAL do, as the soundness of STAL is simpler than TAL's. Nevertheless, instead of merely supplying type signatures to their code, now the programmers have to also supply meta logic proof.

As an intermediate step from TAL toward XCAP, STAL only upgrades TAL's expressiveness by using general logic implications in the subtyping and instruction typing rules. STAL program specifications are still fully syntactic types and thus are not as expressive as general logic predicate.

It is obvious to obtain the following TAL-to-STAL typing translation theorem. (We ignore the trivial cases of those judgments where TAL and STAL share the same rules.)

**Theorem 5.4 (Typing Preservations from TAL to STAL)**

1. if  $\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta'].\Gamma'$  then  $\vdash_{\text{STAL}} [\Delta].\Gamma \leq [\Delta'].\Gamma'$ ;
2. if  $\Psi \vdash_{\text{TAL}} \{\Gamma\}c\{\Gamma'\}$  then  $\Psi \vdash_{\text{STAL}} \{\Gamma\}c\{\Gamma'\}$ .

If we follow the direction of STAL and make one more step by trying to bring in general logic predicates into the types, we can obtain an even better TAL. In fact, XCAP is such a system in some sense, as the translation in the next section will show.

## 5.4 Translations from TAL/STAL to XCAP

The gap between STAL and XCAP is mainly on the specification language and the various state and value typing. At the core of the translation from STAL to XCAP is the translations from TAL/STAL types and typings into XCAP predicates, which we discuss first. Our translations preserve the typing structure and well-formedness of TM programs.

*Translation from TAL WordTy to XCAP WordTy*

$$\begin{aligned}
 \lceil \text{int} \rceil &\triangleq \lambda w. \text{True} \\
 \lceil \text{code } [\Delta]. \Gamma \rceil &\triangleq \lambda w. \text{codeptr}(w, \lceil [\Delta]. \Gamma \rceil) \\
 \lceil \langle \tau_1, \dots, \tau_n \rangle \rceil &\triangleq \lambda w. \text{record}(w, \lceil \tau_1 \rceil, \dots, \lceil \tau_n \rceil) \\
 \lceil \exists \alpha. \tau \rceil &\triangleq \lambda w. \exists \alpha: \text{Word} \rightarrow \text{PropX}. \lceil \tau \rceil w \\
 \lceil \mu \alpha. \tau \rceil &\triangleq \lambda w. (\mu \alpha: \text{Word} \rightarrow \text{PropX}. \lambda x: \text{Word}. (\lceil \tau \rceil x) \wedge w)
 \end{aligned}$$

*Translation from TAL Precondition to XCAP Assertion*

$$\begin{aligned}
 \lceil [\alpha_1, \dots, \alpha_m]. \{r_1 \rightsquigarrow \tau_1, \dots, r_n \rightsquigarrow \tau_n\} \rceil \\
 \triangleq \lambda (\mathbb{H}, \mathbb{R}). \exists \alpha_1, \dots, \alpha_m: \text{Word} \rightarrow \text{PropX}. (\lceil \tau_1 \rceil \mathbb{R}(r_1)) \wedge \dots \wedge (\lceil \tau_n \rceil \mathbb{R}(r_n))
 \end{aligned}$$

*Translation from TAL CdHpSpec to XCAP CdHpSpec*

$$\lceil \{l_1 \rightsquigarrow [\Delta_1]. \Gamma_1, \dots, l_n \rightsquigarrow [\Delta_n]. \Gamma_n\} \rceil \triangleq \{l_1 \rightsquigarrow \lceil [\Delta_1]. \Gamma_1 \rceil, \dots, l_n \rightsquigarrow \lceil [\Delta_n]. \Gamma_n \rceil\}$$

*Translation from TAL DtHpSpec to XCAP DtHpSpec*

$$\lceil \{l_1 \rightsquigarrow \tau_1, \dots, l_n \rightsquigarrow \tau_n\} \rceil \triangleq \{l_1 \rightsquigarrow \lceil \tau_1 \rceil, \dots, l_n \rightsquigarrow \lceil \tau_n \rceil\}$$

Figure 5.5: Translations from TAL types to XCAP predicates

**Translation of TAL types** We present the type translations from TAL/STAL to XCAP in Figure 5.5. Various  $\lceil \cdot \rceil$  translate TAL types into XCAP assertions and specifications.

In the word type translation, integer type becomes a tautology as any machine word can be treated as an integer. Code type is translated into ECP formulas. Tuple types in TAL is translated into record type in XCAP. Existential types and recursive types in TAL are also translated into their XCAP counterparts.

The translation of a TAL precondition, which is a type variable environment plus a register file type, is an XCAP assertion. The type variables in the environment are now existentially quantified over XCAP word types at the outmost of the target assertion. The register file typing corresponds to a bunch of conjunctions of register value testing.

The translations of TAL code and data heap specifications are carried out by simply translating each element's type in them into XCAP assertions or word types, and preserving the partial mapping.

**Typing Preservations** An important property of the previous translations is whether they preserve the typing structure and well-formedness of TM program in XCAP. This breaks down to whether all well-formed TAL/STAL entities are still well-formed in XCAP after the translations. As the following typing preservation lemma shows, all STAL typing derivations indeed get preserved in XCAP after the translations.

**Theorem 5.5 (Typing Preservations from STAL to XCAP)**

1. If  $\Psi \vdash_{\text{STAL}} \{[\Delta].\Gamma\} \mathbb{P}$  then  $\Gamma \Psi \vdash_{\text{XCAP}} \{\Gamma[\Delta].\Gamma\} \mathbb{P}$ ;
2. if  $\Psi \vdash_{\text{STAL}} \mathbb{C}:\Psi'$  then  $\Gamma \Psi \vdash_{\text{XCAP}} \mathbb{C}:\Gamma \Psi'$ ;
3. if  $\Psi \vdash_{\text{STAL}} \{[\Delta].\Gamma\} \mathbb{I}$  then  $\Gamma \Psi \vdash_{\text{XCAP}} \{\Gamma[\Delta].\Gamma\} \mathbb{I}$ ;
4. if  $\Psi \vdash_{\text{STAL}} \mathbb{S}:[\Delta].\Gamma$  then  $[\Gamma[\Delta].\Gamma]_{\Gamma \Psi} \mathbb{S}$ ;
5. if  $\Psi \vdash_{\text{STAL}} \mathbb{H}:\Phi$  then  $\mathcal{D}\mathcal{H} \Gamma \Psi \vdash_{\text{XCAP}} \Gamma \Phi \mathbb{H}$ ;
6. if  $\Psi;\Phi \vdash_{\text{STAL}} \mathbb{R}:\Gamma$  then  $[\Gamma].\Gamma \vdash_{\Gamma \Psi, \Gamma \Phi} (\mathbb{H}, \mathbb{R})$ ;
7. if  $\Psi;\Phi \vdash_{\text{STAL}} \mathbb{w}:\tau$  then  $[\Gamma \tau \vdash_{\Gamma \Psi, \Gamma \Phi} \mathbb{w}]$ ;
8. if  $\vdash_{\text{STAL}} [\Delta].\Gamma \leq [\Delta'].\Gamma'$  then  $\Gamma[\Delta].\Gamma \Rightarrow \Gamma[\Delta'].\Gamma'$ ;
9. if  $\Psi \vdash_{\text{STAL}} \{\Gamma\} \mathbb{c}\{\Gamma'\}$  then  $\Gamma[\Delta].\Gamma \vdash_{\Psi} \mathbb{c} \Rightarrow \Gamma[\Delta].\Gamma'$ .

**Proof.** We show selected cases for (3). By induction over the structure of  $\Psi \vdash_{\text{STAL}} \{[\Delta].\Gamma\} \mathbb{I}$ .

**case WEAKEN.**

$$\frac{\vdash [\Delta].\Gamma \leq [\Delta'].\Gamma' \quad \Psi \vdash \{[\Delta'].\Gamma'\} \mathbb{I}}{\Psi \vdash \{[\Delta].\Gamma\} \mathbb{I}}$$

By induction hypodissertation it follows that  $\Gamma \Psi \vdash_{\text{XCAP}} \{\Gamma[\Delta'].\Gamma'\} \mathbb{I}$ . The following weakening lemma easily holds for XCAP: “if  $\Psi \vdash \{a'\} \mathbb{I}$  and  $a \Rightarrow a'$  then  $\Psi \vdash \{a\} \mathbb{I}$ ”. By (8) it follows that  $\Gamma[\Delta].\Gamma \Rightarrow \Gamma[\Delta'].\Gamma'$ . Thus it follows that  $\Gamma \Psi \vdash_{\text{XCAP}} \{\Gamma[\Delta].\Gamma\} \mathbb{I}$ .

**case SEQ.**

$$\frac{\vdash \{\Gamma\} c \{\Gamma'\} \quad \Psi \vdash \{[\Delta].\Gamma'\} \mathbb{I} \quad c \in \{\text{add, addi, mov, movi, ld, st}\}}{\Psi \vdash \{[\Delta].\Gamma\} c; \mathbb{I}}$$

By induction hypodissertation it follows that  $\vdash \Psi \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma'\} \mathbb{I}$ . By (9) it follows that  $\vdash [\Delta].\Gamma \vdash_{\Psi} \Rightarrow_{\mathbb{C}} \vdash [\Delta].\Gamma'$ . By rule SEQ it follows that  $\vdash \Psi \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma\} c; \mathbb{I}$ .

case JD.

$$\frac{f \in \text{dom}(\Psi) \quad \vdash [\Delta].\Gamma \leq \Psi(f)}{\Psi \vdash \{[\Delta].\Gamma\} \text{jd } f}$$

By (8) it follows that  $\vdash [\Delta].\Gamma \Rightarrow \vdash \Psi(f) \mathbb{I}$ , and then  $\vdash [\Delta].\Gamma \Rightarrow \vdash \Psi(f)$ . By rule JD it follows that  $\vdash \Psi \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma\} \text{jd } f$ .

case JMP.

$$\frac{\Gamma(r) = \text{code } [\Delta'].\Gamma' \quad \vdash [\Delta].\Gamma \leq [\Delta'].\Gamma'}{\Psi \vdash \{[\Delta].\Gamma\} \text{jmp } r}$$

By translation it follows that  $\vdash [\Delta].\Gamma \Rightarrow \lambda(\mathbb{H}, \mathbb{R}).\text{cptr}(\mathbb{R}(r), \vdash [\Delta'].\Gamma') \mathbb{I}$ . By (8) it follows that  $\vdash [\Delta].\Gamma \Rightarrow \vdash [\Delta'].\Gamma' \mathbb{I}$ . By rule JMP it follows that  $\vdash \Psi \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma\} \text{jmp } r$ .

case MOVF.

$$\frac{f \in \text{dom}(\Psi) \quad \Psi \vdash \{[\Delta].\Gamma\{r_d \rightsquigarrow \text{code } \Psi(f)\}\} \mathbb{I}}{\Psi \vdash \{[\Delta].\Gamma\} \text{movi } r_d, f; \mathbb{I}}$$

By induction hypodissertation it follows that  $\vdash \Psi \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma\{r_d \rightsquigarrow \text{code } \Psi(f)\}\} \mathbb{I}$ . By rule ECP and SEQ it follows that  $\vdash \Psi \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma\} \text{movi } r_d, f; \mathbb{I}$ . ■

## 5.5 Discussion

We discussed the relationship between TAL and XCAP by showing a translation from TAL to XCAP. However, it is by no means limited to the translation itself.

For example, one can treat the translation as a shallow embedding of TAL types and typing rules in XCAP, which means there is no need to build meta theory for TAL while still letting the programmer work at the TAL level.

One interesting way to view the XCAP system is from TAL programmer's perspective. When writing programs that will interact with XCAP code that involve complex logical

specification not expressible in TAL, so long as the interfaces abstract and hide that part of “real” logical specification (*e.g.*, by using existential packages), and local code behavior is simple enough (which is usually the case), the programmers can “pretend” that they are dealing with external TAL code instead of XCAP ones, by using the macros and lemmas defined for the translations. This lowers the requirement on programmers, as they do not need to learn XCAP, even if they are dealing with external XCAP code.



# Chapter 6

## A Port to x86 Machine

To demonstrate the potential of the XCAP framework, we want to use it to reason about realistic system code that actually runs the x86 machine, which is different from and more complex than the RISC-like target machine used in previous chapters. In this chapter, we present XCAP86, a port of the XCAP framework on Mini86, a faithful subset of the x86 architecture. XCAP86 and Mini86 adds the support of instruction decoding, finite machine word, word-aligned byte-addressed memory, conditional flags, built-in stack and push/pop instructions, and function call/return instructions.

The Mini86 machine is realistic, which brings additional complexities and proof engineering issues. Therefore, on top of XCAP86, we made practical adaptations and built useful abstractions, particularly on the handling of the stack and function calls. We demonstrate the usage of the XCAP86 and these abstractions for the verification of a polymorphic queue module, which is going to be used by the mini thread library in the next chapter.

### 6.1 Mini86: a Subset of the x86 Architecture

The execution environment of Mini86 (Figure 6.1) consists of a memory, a register file of eight general-purpose registers, a flags register made up of a carry bit and a zero bit, and a program counter. Following the x86 *Flat Model* [34], the memory of Mini86 appears



Figure 6.1: Mini86 execution environment

to a program as a single, continuous address space. Code, data (static and dynamic), and system stack all reside in this address space. Mini86 has a word size of 32 bits. Its memory is finite and ultimately restricted by the word size. The non-code part of the memory, the register file, and the flags register are referred to collectively as the machine state.

In comparison, the actual x86 execution environment uses an EFLAGS register to record a group of status and control flags including those mentioned above, and the eip (programmer counter) register can also be controlled implicitly by interrupts and exceptions, which we omit. We also omit the segment registers in x86, because they play no role in the Flat memory model.

The syntax of Mini86 is shown in Figure 6.2. In Mini86, two memory addressing modes are supported: a “direct” mode uses an immediate value as the absolute address; a “base indexed” mode uses a register value as the base address and an immediate value as the offset. Common instructions are included for arithmetic, data movement, comparison, control flow transfer, and stack manipulation.

An operand (*o*) is either an immediate value or a register. When an instruction takes two operands, the first one is the target operand. Conditional jumps are supported with help of conditional codes (a, b, and e stands for *above*, *below*, and *equal* respectively), which are represented by different combinations of cf and zf, and set by arithmetic and comparison instructions.

We present the operational semantics of Mini86 in Figure 6.3. All instructions are

```

(Program)  $\mathbb{P} ::= (\mathbb{S}, pc)$ 
(State)  $\mathbb{S} ::= (\mathbb{H}, \mathbb{R}, \mathbb{F})$ 
(Mem)  $\mathbb{H} ::= \{l \rightsquigarrow w\}^*$ 
(Rfile)  $\mathbb{R} ::= \{r \rightsquigarrow w\}^*$ 
(FReg)  $\mathbb{F} ::= \{cf \rightsquigarrow b, zf \rightsquigarrow b\}$ 
(Word)  $w ::= i$  (unsigned 32 bit integers)
(Labels)  $l ::= i$  (unsigned 32 bit integers, i%4=0)
(CdLbl)  $f ::= i$  (unsigned 32 bit integers)
(Reg)  $r ::= eax | ebx | ecx | edx | esi | edi | ebp | esp$ 
(Bool)  $b ::= tt | ff$ 
(Cond)  $cc ::= a | ae | b | be | e | ne$ 
(Addr)  $d ::= i | r \pm i$ 
(Opr)  $o ::= i | r$ 
(Instr)  $c ::= add r, o | sub r, o | mov r, o | mov r, [d] | mov [d], o$ 
          $| cmp r, o | jcc f | jmp o | push o | pop r | pop | call o | ret$ 

```

Figure 6.2: Mini86 syntax

encoded as machine words and stored in the memory; at each step, the machine has to decode the instruction from the memory at  $pc$ . During execution, the machine would fetch words based on the program counter  $pc$  and decode the words before executing it.

We use  $Dc$  to fetch and decode an instruction out of the memory  $\mathbb{H}$  given a program counter  $pc$ . The result of  $Dc$  is an instruction  $c$  and a new program counter  $npc$ . We also use a macro  $\text{Next}_c(\mathbb{S})$  to define the transition of the machine state on some of the instructions. It is worth noting that this macro provides only a partial mapping: there are cases where no valid next states are defined. Examples include accessing invalid memory or stack locations. The Mini86 machine gets stuck in these cases.

The stack instructions of Mini86 assume that the stack pointer is held by  $esp$ . As a general rule,  $esp$  should not be used for any other purposes. For example, the  $push o$  instruction decrements  $esp$  by 4 and writes the value of  $o$  onto the new top of the stack. The  $pop r$  instruction reads a value from the top of the stack, puts it into  $r$ , and increments  $esp$  by 4.

| Suppose $Dc(\mathbb{H}, pc) = (c, npc)$ |                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
|-----------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| if $c =$                                | then $((\mathbb{H}, \mathbb{R}, \mathbb{F}), pc) \longmapsto$                                                                                                                                                                                                                                                                                                                                                                                                               |
| jmp o                                   | $(\mathbb{H}, \mathbb{R})\hat{\mathbb{R}}(o)$                                                                                                                                                                                                                                                                                                                                                                                                                               |
| jcc f                                   | if $\hat{\mathbb{F}}(cc)$ then $(\mathbb{H}, \mathbb{R})f$ else $(\mathbb{H}, \mathbb{R})npc$<br>where suppose $\mathbb{F} = \{cf \rightsquigarrow cf, zf \rightsquigarrow zf\}$ , we have<br>$\hat{\mathbb{F}}(a) \triangleq \neg cf \wedge \neg zf$ , $\hat{\mathbb{F}}(ae) \triangleq \neg cf$ , $\hat{\mathbb{F}}(b) \triangleq cf$ ,<br>$\hat{\mathbb{F}}(be) \triangleq cf \vee zf$ , $\hat{\mathbb{F}}(e) \triangleq zf$ , $\hat{\mathbb{F}}(ne) \triangleq \neg zf$ |
| call o                                  | $(\text{Next}_{\text{push } npc}(\mathbb{H}, \mathbb{R}, \mathbb{F}), \hat{\mathbb{R}}(o))$                                                                                                                                                                                                                                                                                                                                                                                 |
| ret                                     | $(\text{Next}_{\text{pop}}(\mathbb{H}, \mathbb{R}, \mathbb{F}), \mathbb{H}(sp))$ when $sp \in \text{dom}(\mathbb{H})$                                                                                                                                                                                                                                                                                                                                                       |
| c                                       | $(\text{Next}_c(\mathbb{H}, \mathbb{R}, \mathbb{F}), npc)$                                                                                                                                                                                                                                                                                                                                                                                                                  |

where

|            |                                                                                                                                                                                                       |
|------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| if $c =$   | then $\text{Next}_c(\mathbb{H}, \mathbb{R}, \mathbb{F}) =$                                                                                                                                            |
| add r, o   | $(\mathbb{H}, \mathbb{R}\{r \rightsquigarrow \mathbb{R}(r) + \hat{\mathbb{R}}(o)\}, \text{CalcF}(\mathbb{R}(r) + \hat{\mathbb{R}}(o)))$<br>when $0 \leq \mathbb{R}(r) + \hat{\mathbb{R}}(o) < 2^{32}$ |
| sub r, o   | $(\mathbb{H}, \mathbb{R}\{r \rightsquigarrow \mathbb{R}(r) - \hat{\mathbb{R}}(o)\}, \text{CalcF}(\mathbb{R}(r) - \hat{\mathbb{R}}(o)))$<br>when $0 \leq \mathbb{R}(r) - \hat{\mathbb{R}}(o) < 2^{32}$ |
| cmp r, o   | $(\mathbb{H}, \mathbb{R}, \text{CalcF}(\mathbb{R}(r) - \hat{\mathbb{R}}(o)))$                                                                                                                         |
| mov r, o   | $(\mathbb{H}, \mathbb{R}\{r \rightsquigarrow \hat{\mathbb{R}}(o)\}, \mathbb{F})$                                                                                                                      |
| mov r, [d] | $(\mathbb{H}, \mathbb{R}\{r \rightsquigarrow \mathbb{H}(\hat{\mathbb{R}}(d))\}, \mathbb{F})$ when $\hat{\mathbb{R}}(d) \in \text{dom}(\mathbb{H})$                                                    |
| mov [d], o | $(\mathbb{H}\{\hat{\mathbb{R}}(d) \rightsquigarrow \hat{\mathbb{R}}(o)\}, \mathbb{R}, \mathbb{F})$ when $\hat{\mathbb{R}}(d) \in \text{dom}(\mathbb{H})$                                              |
| push o     | $(\mathbb{H}\{sp - 4 \rightsquigarrow \hat{\mathbb{R}}(o)\}, \mathbb{R}\{\text{esp} \rightsquigarrow sp - 4\}, \mathbb{F})$<br>when $sp - 4 \in \text{dom}(\mathbb{H})$                               |
| pop r      | $(\mathbb{H}, \mathbb{R}\{r \rightsquigarrow \mathbb{H}(sp)\}\{\text{esp} \rightsquigarrow sp + 4\}, \mathbb{F})$<br>when $sp \in \text{dom}(\mathbb{H})$ and $0 \leq sp + 4 < 2^{32}$                |
| pop        | $(\mathbb{H}, \mathbb{R}\{\text{esp} \rightsquigarrow sp + 4\}, \mathbb{F})$<br>when $0 \leq sp + 4 < 2^{32}$                                                                                         |

$Dc()$  is the instruction decoding function

$$\hat{\mathbb{R}}(i) \triangleq i \quad \hat{\mathbb{R}}(r) \triangleq \mathbb{R}(r) \quad \hat{\mathbb{R}}(r \pm i) \triangleq \mathbb{R}(r) \pm i$$

$$\text{CalcF}(i) \triangleq \{cf \mapsto i < 0, zf \mapsto i = 0\}$$

Figure 6.3: Dynamic semantics of Mini86

The `call` instruction pushes the return address (calculated from `pc`) onto the stack and transfers the control to the callee (by updating `pc`). The `ret` instruction bump the stack pointer by 4 and transfer the control to the return address. Note that the `ret` instruction does not necessarily transfer the control back to the caller, as the program may modify the stack in arbitrary ways. Such maneuver is indeed commonly used in implementing thread primitives.

## 6.2 XCAP86: a Port of XCAP on Mini86

In this section we first introduce the XCAP86 language, where most of the constructs and rules are the same as XCAP. We then discuss how to abstract and reason about memory, stack, function call/return interfaces.

**Assertion language.** The syntax of XCAP86 is given in Figure 6.4. We abstract a code heap  $\mathbb{C}$  out of the actual memory  $\mathbb{H}$  of Mini86. Although the code heap is presented in the syntax as a mapping from code labels to instruction sequences, it is actually embedded in the memory by encoding instructions as machine words. XCAP86 includes features such as embedded code pointers, impredicative polymorphisms, and recursive specifications.

We present the validity rules of XCAP86 extended propositions in Figure 6.5. Following XCAP, the interpretation of extended propositions is defined as their validity under the empty environment:

$$[\![P]\!]_{\Psi} \triangleq \cdot \vdash_{\Psi} P$$

Following XCAP, we establish the following soundness theorem of the interpretation of extended propositions (with respect to CiC/Coq).

### Theorem 6.1 (Soundness of XCAP86 *PropX* Interpretation)

1. If  $[\![\langle p \rangle]\!]_{\Psi}$  then  $p$ ;
2. if  $[\![\text{cptr}(f, a)]]\!]_{\Psi}$  then  $\Psi(f) = a$ ;

$(CodeHeap) \quad \mathbb{C} \quad ::= \{f \rightsquigarrow \mathbb{I}\}^*$   
 $(InstrSeq) \quad \mathbb{I} \quad ::= c \mid c;[f] \mid c;\mathbb{I}$   
 $(PropX) \quad P, Q \quad ::= \langle p \rangle \quad \text{lifted meta proposition}$   
 $\quad \quad \quad \mid \text{cptr}(f, a) \quad \text{embedded code pointer}$   
 $\quad \quad \quad \mid P \wedge Q \quad \text{conjunction}$   
 $\quad \quad \quad \mid P \vee Q \quad \text{disjunction}$   
 $\quad \quad \quad \mid P \rightarrow Q \quad \text{implication}$   
 $\quad \quad \quad \mid \forall x:A.P \quad \text{universal quantification}$   
 $\quad \quad \quad \mid \exists x:A.P \quad \text{existential quantification}$   
 $\quad \quad \quad \mid \forall a:A \rightarrow PropX.P \quad \text{imp. universal quan.}$   
 $\quad \quad \quad \mid \exists a:A \rightarrow PropX.P \quad \text{imp. existential quan.}$   
 $\quad \quad \quad \mid (\mu \alpha.\lambda x:A.P \ B) \quad \text{recursive definition}$   
 $(CdHpSpec) \quad \Psi \quad ::= \{f \rightsquigarrow a\}^*$   
 $(Assertion) \quad a \quad \in \quad \text{State} \rightarrow PropX$   
 $(AssertImp) \quad a \Rightarrow a' \triangleq \forall \Psi, \mathbb{S}. \llbracket a \rrbracket_\Psi \mathbb{S} \supset \llbracket a' \rrbracket_\Psi \mathbb{S}$   
 $(StepImp) \quad a \Rightarrow_C a' \triangleq \forall \Psi, \mathbb{S}. \llbracket a \rrbracket_\Psi \mathbb{S} \supset \llbracket a' \rrbracket_\Psi \text{Next}_C(\mathbb{S})$

Figure 6.4: Syntax of XCAP86

$\boxed{\Gamma \vdash_{\Psi} P}$

**(Validity of Extended Propositions)**

$(env) \quad \Gamma := \cdot \mid \Gamma, P$

*(The following presentation omits the  $\Psi$  in judgment  $\Gamma \vdash_{\Psi} P$ .)*

$$\begin{array}{c}
 \frac{P \in \Gamma}{\Gamma \vdash P} \text{ (ENV)} \quad \frac{p}{\Gamma \vdash \langle p \rangle} \text{ (}\langle \rangle\text{-I)} \quad \frac{\Gamma \vdash \langle p \rangle \quad p \supset (\Gamma \vdash Q)}{\Gamma \vdash Q} \text{ (}\langle \rangle\text{-E)} \\
 \\ 
 \frac{\Psi(f) = a}{\Gamma \vdash \text{cptr}(f, a)} \text{ (CP-I)} \quad \frac{\Gamma \vdash \text{cptr}(f, a) \quad (\Psi(f) = a) \supset (\Gamma \vdash Q)}{\Gamma \vdash Q} \text{ (CP-E)} \\
 \\ 
 \frac{\Gamma \vdash P \quad \Gamma \vdash Q}{\Gamma \vdash P \wedge Q} \text{ (\wedge-I)} \quad \frac{\Gamma \vdash P \wedge Q}{\Gamma \vdash P} \text{ (\wedge-E1)} \quad \frac{\Gamma \vdash P \wedge Q}{\Gamma \vdash Q} \text{ (\wedge-E2)} \\
 \\ 
 \frac{\Gamma \vdash P}{\Gamma \vdash P \vee Q} \text{ (\vee-I1)} \quad \frac{\Gamma \vdash Q}{\Gamma \vdash P \vee Q} \text{ (\vee-I2)} \quad \frac{\Gamma \vdash P \vee Q \quad \Gamma, P \vdash R \quad \Gamma, Q \vdash R}{\Gamma \vdash R} \text{ (\vee-E)} \\
 \\ 
 \frac{\Gamma, P \vdash Q}{\Gamma \vdash P \rightarrow Q} \text{ (}\rightarrow\text{-I)} \quad \frac{\Gamma \vdash P \rightarrow Q \quad \Gamma \vdash P}{\Gamma \vdash Q} \text{ (}\rightarrow\text{-E)} \\
 \\ 
 \frac{\Gamma \vdash P[B/x] \quad \forall B:A}{\Gamma \vdash \forall x:A. P} \text{ (\forall-I1)} \quad \frac{\Gamma \vdash \forall x:A. P \quad B:A}{\Gamma \vdash P[B/x]} \text{ (\forall-E1)} \\
 \\ 
 \frac{B:A \quad \Gamma \vdash P[B/x]}{\Gamma \vdash \exists x:A. P} \text{ (\exists-I1)} \quad \frac{\Gamma \vdash \exists x:A. P \quad \Gamma, P[B/x] \vdash Q \quad \forall B:A}{\Gamma \vdash Q} \text{ (\exists-E1)} \\
 \\ 
 \frac{\Gamma \vdash P[a/\alpha] \quad \forall a:A \rightarrow \text{Prop} X}{\Gamma \vdash \forall \alpha:A \rightarrow \text{Prop} X. P} \text{ (\forall-I2)} \quad \frac{a:A \rightarrow \text{Prop} X \quad \Gamma \vdash P[a/\alpha]}{\Gamma \vdash \exists \alpha:A \rightarrow \text{Prop} X. P} \text{ (\exists-I2)} \\
 \\ 
 \frac{B:A \quad \Gamma \vdash P[B/x][\mu \alpha:A \rightarrow \text{Prop} X. \lambda x:A. P/\alpha]}{\Gamma \vdash (\mu \alpha:A \rightarrow \text{Prop} X. \lambda x:A. P \ B)} \text{ (\mu-I)}
 \end{array}$$

Figure 6.5: Validity rules of XCAP86 extended propositions

3. if  $\llbracket P \wedge Q \rrbracket_\Psi$  then  $\llbracket P \rrbracket_\Psi$  and  $\llbracket Q \rrbracket_\Psi$ ;
4. if  $\llbracket P \vee Q \rrbracket_\Psi$  then either  $\llbracket P \rrbracket_\Psi$  or  $\llbracket Q \rrbracket_\Psi$ ;
5. if  $\llbracket P \rightarrow Q \rrbracket_\Psi$  and  $\llbracket P \rrbracket_\Psi$  then  $\llbracket Q \rrbracket_\Psi$ ;
6. if  $\llbracket \forall x:A. P \rrbracket_\Psi$  and  $B:A$  then  $\llbracket P[B/x] \rrbracket_\Psi$ ;
7. if  $\llbracket \exists x:A. P \rrbracket_\Psi$  then there exists  $B:A$  such that  $\llbracket P[B/x] \rrbracket_\Psi$ ;
8. if  $\llbracket \forall \alpha:A \rightarrow \text{PropX}. P \rrbracket_\Psi$  and  $a:A \rightarrow \text{PropX}$  then  $\llbracket P[a/\alpha] \rrbracket_\Psi$ ;
9. if  $\llbracket \exists \alpha:A \rightarrow \text{PropX}. P \rrbracket_\Psi$  then there exists  $a:A \rightarrow \text{PropX}$  such that  $\llbracket P[a/\alpha] \rrbracket_\Psi$ ;
10. if  $\llbracket (\mu \alpha. \lambda x:A. P \ B) \rrbracket_\Psi$  then  $\llbracket P[B/x] \llbracket (\mu \alpha. \lambda x:A. P \ )/\alpha \rrbracket \rrbracket_\Psi$ .

**Corollary 6.2 (XCAP86 Consistency)**  $\llbracket \langle \text{False} \rangle \rrbracket_\Psi$  is not provable.

**Inference rules.** The major difference between XCAP86 and XCAP is on the inference rules, which is presented in Figure 6.6. In the top-level well-formed program rule,  $\text{DC}(\mathbb{C})$  is a predicate that establishes the proper instruction decoding relation between the code heap  $\mathbb{C}$  and the memory  $\mathbb{H}$ . Its implementation makes use of the single-instruction decoding function  $\text{Dc}()$  that appears in the previous section.  $\text{lookup}(\mathbb{C}, f, \mathbb{I})$  is a macro that checks whether the instruction sequence  $\mathbb{I}$  is inside code heap  $\mathbb{C}$  at location  $f$ .

Using `cptr`, XCAP86 supports the reasoning of embedded code pointers (ECP) in general. The idea can be naturally adapted according to the x86 additional instructions that use ECPs. In particular, there are now three new rules for the function call and return instructions of Mini86, assuming a built-in stack. A `call` instruction pushes a return address onto the stack and transfers the control to the target code. In our actual implementation, the return address is calculated from the `pc`. To avoid obfuscating the presentation, we used an explicit  $[f_{ret}]$  in the above Rule `CALLI`. This rule says, if  $a$  holds on the current state, then  $\Psi(f)$  holds on the updated state after executing the stack push. The Rule `RET`

|                                                                                                                                                                                                                                                                                                                                                                                          |                                                                |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|
| $\boxed{\Psi_G \vdash \{a\} \mathbb{P}}$                                                                                                                                                                                                                                                                                                                                                 | <b>(Well-formed Program)</b>                                   |
| $\Psi_G \vdash \mathbb{C} : \Psi_G \quad ((\text{DC}(\mathbb{C}) * \llbracket a \rrbracket_{\Psi_G}) \mathrel{\text{S}}) \quad \text{lookup}(\mathbb{C}, pc, \mathbb{I}) \quad \Psi_G \vdash \{a\} \mathbb{I}$                                                                                                                                                                           | $\frac{}{\Psi_G \vdash \{a\} (\mathbb{S}, pc)} \text{ (PROG)}$ |
| $\boxed{\Psi_{IN} \vdash \mathbb{C} : \Psi}$                                                                                                                                                                                                                                                                                                                                             | <b>(Well-formed Code Heap)</b>                                 |
| $\frac{\Psi_{IN} \vdash \{a_i\} \mathbb{I}_i \quad \forall f_i}{\Psi_{IN} \vdash \{f_1 \rightsquigarrow \mathbb{I}_1, \dots, f_n \rightsquigarrow \mathbb{I}_n\} : \{f_1 \rightsquigarrow a_1, \dots, f_n \rightsquigarrow a_n\}} \text{ (CDHP)}$                                                                                                                                        |                                                                |
| $\frac{\Psi_{IN1} \vdash \mathbb{C}_1 : \Psi_1 \quad \Psi_{IN2} \vdash \mathbb{C}_2 : \Psi_2 \quad \Psi_{IN1}(f) = \Psi_{IN2}(f) \quad \text{dom}(\mathbb{C}_1) \cap \text{dom}(\mathbb{C}_2) = \emptyset \quad \forall f \in \text{dom}(\Psi_{IN1}) \cap \text{dom}(\Psi_{IN2})}{\Psi_{IN1} \cup \Psi_{IN2} \vdash \mathbb{C}_1 \cup \mathbb{C}_2 : \Psi_1 \cup \Psi_2} \text{ (LINK)}$ |                                                                |
| $\boxed{\Psi \vdash \{a\} \mathbb{I}}$                                                                                                                                                                                                                                                                                                                                                   | <b>(Well-formed Instruction Sequence)</b>                      |
| $\frac{a \Rightarrow_C a' \quad \Psi \vdash \{a'\} \mathbb{I} \quad c \in \{\text{add, sub, cmp, mov, push, pop}\}}{\Psi \vdash \{a\} c; \mathbb{I}} \text{ (SEQ)}$                                                                                                                                                                                                                      |                                                                |
| $\frac{a \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi)}{\Psi \vdash \{a\} \text{jmp } f} \text{ (JMPI)}$                                                                                                                                                                                                                                                                              |                                                                |
| $\frac{(\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \neg \hat{F}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow a' \quad \Psi \vdash \{a'\} \mathbb{I}}{(\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \hat{F}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi)} \text{ (JCC)}$         |                                                                |
| $\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a') \quad a \Rightarrow a'}{\Psi \vdash \{a\} \text{jmp } r} \text{ (JMPR)}$                                                                                                                                                                                                                |                                                                |
| $\frac{(\lambda \mathbb{S}. \text{cptr}(f, \Psi(f)) \wedge a \mathbb{S}) \Rightarrow a' \quad f \in \text{dom}(\Psi) \quad \Psi \vdash \{a'\} \mathbb{I}}{\Psi \vdash \{a\} \mathbb{I}} \text{ (ECP)}$                                                                                                                                                                                   |                                                                |
| $\frac{a \Rightarrow_{\text{push}} f_{\text{ret}} \Psi(f) \quad f \in \text{dom}(\Psi)}{\Psi \vdash \{a\} \text{call } f; [f_{\text{ret}}]} \text{ (CALLI)}$                                                                                                                                                                                                                             |                                                                |
| $\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a') \quad a \Rightarrow_{\text{push}} f_{\text{ret}} a'}{\Psi \vdash \{a\} \text{call } r; [f_{\text{ret}}]} \text{ (CALLR)}$                                                                                                                                                               |                                                                |
| $\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{H}(\mathbb{R}(\text{esp})), a') \quad a \Rightarrow_{\text{pop}} a'}{\Psi \vdash \{a\} \text{ ret}} \text{ (RET)}$                                                                                                                                                                                 |                                                                |

Figure 6.6: Inference rules of XCAP86

says, the top of the stack is a code pointer with pre-condition  $a'$  and, if  $a$  holds on the current state,  $a'$  holds on the updated state after popping out the return address.

It is worth noting that Rule CALLI does not enforce the validity of the return address. This allows some “fake” function calls that never return, a pattern indeed used in the thread library in the next chapter. Specialized derived rules can be built as lemmas to reflect regular function calls.

**Soundness.** Following XCAP, we establish the soundness of these inferences rules with respect to the Mini86 operational semantics.

#### Lemma 6.3 (XCAP86 Progress)

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \mapsto \mathbb{P}'$ .

#### Lemma 6.4 (XCAP86 Preservation)

If  $\Psi_G \vdash \{a\} \mathbb{P}$  and  $\mathbb{P} \mapsto \mathbb{P}'$  then there exists an assertion  $a'$  such that  $\Psi_G \vdash \{a'\} \mathbb{P}'$ .

#### Theorem 6.5 (XCAP86 Soundness)

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then for all natural number  $n$ , there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \multimap^n \mathbb{P}'$ .

For the presentation of the rest of this chapter and the next chapter, we ignore the difference between *Prop* and *PropX*, and always use the syntax of *Prop* terms for *PropX* terms and  $\mu$  for  $\mu$ .

**Reasoning about memory.** The reasoning on memory (data heap and stack) operations is largely carried out following separation logic [48, 53]. In particular, the primitives of separation logic are defined as shorthands using the primitives of the underlying logic in XCAP (a shallow embedding of separation logic in the assertion language). Some representative cases are given as follows.

$$\begin{aligned}
\mathsf{emp} &\triangleq \lambda \mathbb{H}. \mathbb{H} = \{\} \\
\mathsf{l} \mapsto \mathsf{w} &\triangleq \lambda \mathbb{H}. \mathsf{l} \neq \mathsf{NULL} \wedge \mathbb{H} = \{\mathsf{l} \rightsquigarrow \mathsf{w}\} \\
\mathsf{l} \mapsto \_ &\triangleq \lambda \mathbb{H}. \exists \mathsf{w}. (\mathsf{l} \mapsto \mathsf{w} \mathbb{H}) \\
\mathsf{a}_1 * \mathsf{a}_2 &\triangleq \lambda \mathbb{H}. \exists \mathbb{H}_1, \mathbb{H}_2. \mathbb{H}_1 \uplus \mathbb{H}_2 = \mathbb{H} \wedge \mathsf{a}_1 \mathbb{H}_1 \wedge \mathsf{a}_2 \mathbb{H}_2 \\
\mathsf{l} \mapsto \mathsf{w}_1, \dots, \mathsf{w}_n &\triangleq \mathsf{l} \mapsto \mathsf{w}_1 * \mathsf{l} + 4 \mapsto \mathsf{w}_2 * \dots * \mathsf{l} + 4(n-1) \mapsto \mathsf{w}_n \\
\mathsf{l} \mapsto [n] &\triangleq \mathsf{l} \mapsto \_, \dots, \_ \quad (\text{the number of } \_ \text{ is } n/4)
\end{aligned}$$

$\mathsf{emp}$  asserts that the memory is empty.  $\mathsf{l} \mapsto \mathsf{w}$  asserts that the memory contains exactly one word at address  $\mathsf{l}$  with the value  $\mathsf{w}$ ; when the value is not important, a wildcard is used as in  $\mathsf{l} \mapsto \_$ . Separating conjunction  $\mathsf{a} * \mathsf{a}'$  asserts that the memory can be split into two disjoint parts in which  $\mathsf{a}$  and  $\mathsf{a}'$  hold respectively ( $\uplus$  represents disjoint union).  $\mathsf{l} \mapsto \mathsf{w}_1, \dots, \mathsf{w}_n$  asserts that  $\mathsf{l}$  is the starting address of a sequence of words  $\mathsf{w}_1, \dots, \mathsf{w}_n$ ; our memory is addressed by bytes, hence the number 4 (one word is 4 bytes). Finally,  $\mathsf{l} \mapsto [n]$  means that  $\mathsf{l}$  points to  $n$  bytes. For conciseness, we sometimes omit lambda and existential bindings of variables, thus writing  $\lambda x_1, \dots, x_n. \exists y_1, \dots, y_m. \mathsf{P}$  simply as  $\mathsf{P}$  when there is no confusion. We also sometimes simplify  $(\mathsf{a} * \mathsf{a}' \mathbb{H})$  as  $\mathsf{a} * \mathsf{a}'$ .

**Stack and calling convention.** Besides specializing XCAP for our machine model, we also built key abstractions to help manage the complexity of the reasoning, such as the handling of the stack and calling convention.

Most Mini86 code follows the convention illustrated in Figure 6.7. The return address is on stack top  $[\mathsf{esp}]$ . Function arguments follow in  $[\mathsf{esp} + 4], [\mathsf{esp} + 8], \dots$ . The return value is stored in  $\mathsf{eax}$ . Unless specified otherwise,  $\mathsf{ebx}, \mathsf{esi}, \mathsf{edi}$  and  $\mathsf{ebp}$  are callee-save registers.

It is therefore convenient to built a specification template reflecting this convention. For a function with  $n$  arguments  $a_1 \dots a_n$ , we write its specification (*i.e.*, a pre-condition in the form of an assertion) as:

$$\mathsf{Fn} \ a_1, \dots, a_n \ \{\mathsf{Aux} : x_1, \dots, x_m; \mathsf{Local} : [fs]; \mathsf{Pre} : \mathsf{a}_{pre}; \mathsf{Post} : \mathsf{a}_{post}\}$$

The intention of this macro is that  $x_1, \dots, x_m$  are “auxiliary variables” commonly used in



Figure 6.7: Function calling convention

Hoare-logic style reasoning,  $fs$  is the size of required free space on the stack, and  $a_{pre}$  and  $a_{post}$  are the pre- and post-conditions of the function. We sometimes refer to the variables collectively as  $\vec{a}$  and  $\vec{x}$ . This macro is defined as follows:

$$\begin{aligned}
 & \exists a_1, \dots, a_n, cs, sp, ss, ret, x_1, \dots, x_m, a_{prv}. \\
 & \quad \text{reg}(cs, sp) \wedge ss \geq fs \\
 & \quad \wedge \text{stack}(sp, ss, ret, a_1, \dots, a_n) * a_{prv} * a_{pre} \\
 & \quad \wedge \text{cptr}(ret, \exists retv. \text{reg}(cs, sp+4) \wedge \text{eax} = retv) \\
 & \quad \wedge \text{stack}(sp+4, ss+4, a_1, \dots, a_n) * a_{prv} * a_{post}
 \end{aligned}$$

The first line of this definition quantifies over the values of (1) function arguments  $a_1, \dots, a_n$ , (2) callee-save registers  $cs$  (a 4-tuple), (3) the stack pointer  $sp$ , (4) the size of available space on stack  $ss$ , (5) the return address  $ret$ , (6) auxiliary variables  $x_1, \dots, x_m$ , and (7) some hidden private data expressed as the predicate  $a_{prv}$ .

The second line relates the register file with the callee-save values  $cs$  (4-tuple) and the stack pointer  $sp$  using the macro below. It also makes sure that there is enough space available on the stack.

$$\text{reg}(ebx, esi, edi, ebp, esp) \triangleq ebx = ebx \wedge esi = esi \wedge edi = edi \wedge ebp = ebp \wedge esp = esp$$

The third line describes (1) the stack frame upon entering the function using the macro

below, (2) the private data hidden from the function, and (3) the user customized precondition  $\mathbf{a}_{pre}$ , which does not directly talk about register files and the current stack frame, since they have already been handled by the calling convention.

$$\text{stack}(sp, ss, w_1, \dots, w_n) \triangleq sp - ss \mapsto [ss] * sp \mapsto w_1, \dots, w_n.$$

The last two lines of the  $\text{Fn}$  definition specify the return address  $ret$  as an embedded code pointer using  $\text{cptr}$ . When a function returns, the callee-save registers, stack frame, and private data must all be preserved, and the post-condition  $\mathbf{a}_{post}$  must be established. Note that (1)  $\mathbf{eax}$  may contain a return value  $retv$ , and (2) the return instruction automatically increases the stack pointer by 4.

We built another abstractions to help manage the complexity of verification. It is a finer-grained version of  $\text{Fn}$  which, besides following the convention above, also specifies local variables on the stack by extending the “Local :  $[fs]$ ;” part to “Local :  $[fs], v_1, \dots, v_k;$ ”, where  $v_i$  are the local values on the stack. The definition of this new  $\text{Fn}$  is:

$$\begin{aligned} & \exists a_1, \dots, a_n, v_1, \dots, v_k, cs, sp, ss, ret, x_1, \dots, x_m, \mathbf{a}_{prv}. \\ & \quad \text{reg}(cs, sp) \wedge ss \geq fs \\ & \quad \wedge \text{stack}(sp, ss, v_1, \dots, v_k, ret, a_1, \dots, a_n) * \mathbf{a}_{prv} * \mathbf{a}_{pre} \\ & \quad \wedge \text{cptr}(ret, \exists retv. \text{reg}(cs, sp+4) \wedge \mathbf{eax} = retv \\ & \quad \wedge \text{stack}(sp+4+4k, ss+4+4k, a_1, \dots, a_n) * \mathbf{a}_{prv} * \mathbf{a}_{post} \end{aligned}$$

We define many derived rules (implemented as lemmas) to facilitate the reasoning related to functions specified using the new  $\text{Fn}$ . We provide the following examples:

$$\frac{\mathbf{a}_{pre} \Rightarrow_c a \quad \Psi \vdash \{\text{Fn } \vec{a} \{ \text{Aux} : \vec{x}; \text{Local} : [fs], \vec{v}; \text{Pre} : a; \text{Post} : \mathbf{a}_{post} \} \} \mathbb{I}}{\Psi \vdash \{\text{Fn } \vec{a} \{ \text{Aux} : \vec{x}; \text{Local} : [fs], \vec{v}; \text{Pre} : a_{pre}; \text{Post} : \mathbf{a}_{post} \} \} c; \mathbb{I}} \text{ (FN-SEQ)}$$

$$\frac{fs \geq 4 \quad \Psi \vdash \{\text{Fn } \vec{a} \{ \text{Aux} : \vec{x}; \text{Local} : [fs-4], w, \vec{v}; \text{Pre} : a_{pre}; \text{Post} : \mathbf{a}_{post} \} \} \mathbb{I}}{\Psi \vdash \{\text{Fn } \vec{a} \{ \text{Aux} : \vec{x}; \text{Local} : [fs], \vec{v}; \text{Pre} : a_{pre}; \text{Post} : \mathbf{a}_{post} \} \} \text{push } w; \mathbb{I}} \text{ (FN-PUSH)}$$

$$\begin{aligned}
\Psi(f) &= \text{Fn } \vec{a}' \{ \text{Aux} : \vec{x}'; \text{Local} : [fs']; \text{Pre} : a_{pre}'; \text{Post} : a_{post}' \} \\
\Psi(f_{ret}) &= \text{Fn } \vec{a} \{ \text{Aux} : \vec{x}; \text{Local} : [fs], \vec{v}; \text{Pre} : a; \text{Post} : a_{post} \} \\
a_{pre} \Rightarrow (a_{prv} * a_{pre}' \wedge (a_{prv} * a_{post}' \Rightarrow a)) \\
\frac{fs - 4 > fs' \quad f \in \text{dom}(\Psi) \quad f_{ret} \in \text{dom}(\Psi)}{\Psi \vdash \{ \text{Fn } \vec{a} \{ \text{Aux} : \vec{x}; \text{Local} : [fs], \vec{v}; \text{Pre} : a_{pre}; \text{Post} : a_{post} \} \} \text{ call } f; [f_{ret}]} \text{ (FN-CALLI)} \\
\frac{a_{pre} \Rightarrow a_{post}}{\Psi \vdash \{ \text{Fn } \vec{a} \{ \text{Aux} : \vec{x}; \text{Local} : [fs]; \text{Pre} : a_{pre}; \text{Post} : a_{post} \} \} \text{ ret}} \text{ (FN-RET)}
\end{aligned}$$

### 6.3 Verification of a polymorphic queue module

We demonstrate some of those macros defined in the previous section with the specification and verification of a polymorphic queue module, which will be used by the thread library in the next chapter. Our queue module's C specification as follows.

```

typedef struct node_st *node_t;
struct node_st node_t next;

void queue_insert (node_t *q, node_t t);
node_t queue_delete (node_t *q);

```

The modules defines a node data structure and functions for queue insertion and remove.

We define the XCAP86 data type for the queue noder pointer as follows:

$$\begin{aligned}
\text{node\_t\_ref}(a, \text{nil}, q) &\triangleq q \mapsto \text{NULL} \\
\text{node\_t\_ref}(a, t :: tl, q) &\triangleq q \mapsto t * a(t) * \text{node\_t\_ref}(a, tl, t)
\end{aligned}$$

This defines a data structure of polymorphic queues, where the link field of a queue node is always stored in the first word of the node. More specifically,  $\text{node\_t\_ref}(a, [t], q)$  represents that  $q$  is a pointer to a queue; the locations of the queue nodes are collected as a list  $[t]$ , and every queue node satisfies the predicate  $a$ . The definition is inductive. In the base case, an empty queue specifies just the queue pointer  $q$  itself, which points to  $\text{NULL}$ . In the inductive case, a queue comprises of a head node at  $t$  satisfying  $a$  and a tail queue of  $\text{node\_t\_ref}(a, tl, t)$ .

*queue\_insert* will insert a properly shaped node into the end of the queue. *queue\_delete()* returns NULL when the queue is empty, otherwise, it returns the first node of the queue and removes it from the queue. We present the code and verification of *queue\_insert()* and *queue\_delete()* in Figure 6.8 and Figure 6.9. In the figures, there are assembly code, comments, function interfaces, and assertions at each key program point.

As the next chapter will show, the polymorphic queue module can be used by multiple modules, which demonstrates the modularity of its specifications.

```

Fn q,t { Aux: a,tl; Local: [0]
  Pre: node_t_ref(a,tl,q) * t ↦ _ * a t;
  Post: node_t_ref(a,tl@[t],q) }

queue_insert: // void queue_insert (node_t *q, node_t t);
Local: [0]; Pre: node_t_ref(a,tl,q) * t ↦ _ * a(t)
  mov ecx, [esp+4]
  mov edx, [esp+8]
  mov [edx+_next], NULL // t->next = NULL;
  mov eax, [ecx] // node_t p = *q;
Local: [0]; Pre: node_t_ref(a,tl,q) * node_t_ref(a,[],t) * a t
  // eax = head(tl) ∧ ecx = q ∧ edx = t
  cmp eax, NULL // if (p == NULL) // add as first element
  jne qi_els
Local: [0]; Pre: node_t_ref(a,tl,q) * node_t_ref(a,[],t) * a t
  // eax = head(tl) = NULL ∧ ecx = q ∧ edx = t
  // unfold, cast
Local: [0]; Pre: q ↦ _ * a t * node_t_ref(a,[],t) ∧ ecx = q ∧ edx = t ∧ tl = []
  mov [ecx], edx // *q = t;
Local: [0]; Pre: q ↦ t * a t * node_t_ref(a,[],t) ∧ ecx = q ∧ edx = t ∧ tl = []
  // fold, cast
Local: [0]; Pre: node_t_ref(a,tl@[t],q)
  ret // return;

qi_els:
Local: [0]; Pre: node_t_ref(a,tl,q) * node_t_ref(a,[],t) * a t
  // eax = head(tl) ≠ NULL ∧ ecx = q ∧ edx = t
  // cast
Local: [0]; Pre: node_t_ref(a,tl,q) * node_t_ref(a,[],t) * a t
  // eax = last(tl1) ∧ edx = t ∧ tl = tl1@tl2
  mov ecx, eax // while (p->next != NULL) // insert at the tail
  mov eax, [eax+_next] // p = p->next;
Local: [0]; Pre: node_t_ref(a,tl,q) * node_t_ref(a,[],t) * a t
  // eax = head(tl2) ∧ ecx = last(tl1) ∧ edx = t ∧ tl = tl1@tl2
  cmp eax, NULL
  jne qi_els
Local: [0]; Pre: node_t_ref(a,tl,q) * node_t_ref(a,[],t) * a t
  // eax = head(tl2) = NULL ∧ ecx = last(tl1) ∧ edx = t ∧ tl = tl1@tl2
  // cast
Local: [0]; Pre: node_t_ref(a,tl,q) * node_t_ref(a,[],t) * a t ∧ ecx = last(tl) ∧ edx = t
  mov [ecx+_next], edx // p->next = t;
Local: [0]; Pre: node_t_ref(a,tl@[t],q)
  ret // return;

```

Figure 6.8: Verification of queue insertion

```

Fn q { Aux: tl, a; Local: [0]
  Pre: node_t_ref(a, tl, q);
  Post: node_t_ref(a, tl', q) ∧ ((retv=NULL ∧ tl=tl'=[]) ∧
                                 (retv≠NULL ∧ tl=retv::tl' ∧ retv→_* a retv)) }

queue_delete: // node_t queue_delete (node_t *q);

Local: [0]; Pre: node_t_ref(a, tl, q)
  mov ecx, [esp+4]
  mov eax, [ecx] // node_t t = *q;

Local: [0]; Pre: node_t_ref(a, tl, q) ∧ eax=head(tl) ∧ ecx=q
  cmp eax, NULL // if(t != NULL)
  je qd_ret

Local: [0]; Pre: node_t_ref(a, tl, q) ∧ eax=head(tl) ∧ ecx=q ∧ eax≠NULL
  // unfold, cast

Local: [0]; Pre: q→t * a t * node_t_ref(a, tl', t) ∧ eax=t ∧ ecx=q ∧ eax≠NULL ∧ tl=t::tl'
  mov edx, [eax+next] // *q = t->next;
  mov [ecx], edx

Local: [0]; Pre: t→_* a t * node_t_ref(a, tl', q) ∧ eax=t ∧ eax≠NULL ∧ tl=t::tl'
  ret // return t;

qd_ret:
Local: [0]; Pre: node_t_ref(a, tl, q) ∧ eax=head(tl) ∧ ecx=q ∧ eax=NULL
  ret // return t;

```

Figure 6.9: Verification of queue deletion



## Chapter 7

# A Certified Mini Thread Library

In this chapter, we describe the mechanized verification of a thread implementation at machine level using XCAP86. Using the first mechanized proof for the safety of a machine-level thread implementation, we want to demonstrate the power and practicality of the XCAP framework, and that the certification of complex machine-level system code is not beyond reach. Our mini thread library, MTH, written in Mini86, is divided into modules for polymorphic queue (presented in the previous chapter), memory management, machine context, and threading.

All MTH code are at the same realistic level as the context switching example, following the Windows/Unix style. MTH and its verification involve neither change of system programming style nor disposal of existing system code base. There is no performance penalty, nor is any compatibility issue.

The specifications and proof of MTH modules and routines are modular. Each piece of code is specified and proved with minimal reference to external code. For example, in the verification of context module, there is no mentioning of thread at all.

We first give an overview of MTH structure, threading model, and threading features. Then we present the verifications of the machine context module and the threading module independently.

## 7.1 MTH: A Mini Thread Library

MTH is a minimal thread library initially modeled after the GNU Portable Threads (Pth) [17], a portable POSIX/ANSI-C based library for Unix. MTH supports non-preemptive scheduling for cooperative multi-threading. While sharing the same address space, every thread has its own program counter, register set, and stack. For simplicity, MTH omits some advanced features from Pth, including priority-based scheduling and events and signals, and adopts some small changes in the library structures. Nonetheless, the handling of machine contexts and thread management reflects sophisticated invariants of multi-threading and suffices in demonstrating the difficulty of verifying machine-level thread implementation.

**MTH structures.** Figure 7.1 divides MTH into several modules. Thread primitives visible to the user are implemented in the *threading* module, which refers to three other modules: *machine context*, *memory* and *queue*. The *queue* module is the one discussed in the previous chapter. Pseudo-code interfaces of these modules are provided for the ease of understanding. In later sections, we will develop more accurate specifications for verification purposes.

The memory module is a miniature library for dynamic storage allocation. It provides routines for heap allocation and deallocation. The queue module supports insertion and removal of queue elements in the expected way. Its routines are polymorphic and can be applied to different types of queues, as long as every queue element stores its link pointer in its first word.

The machine context module deserves some attention. A machine context (type `mctx_st`) stores the values of general-purpose registers, where the stack pointer `esp` should point to a stack with a valid return address on top. Context switching `swapcontext()` saves the current context as `old`, and loads a new one from `new` for further execution. Context loading `loadcontext()` loads a new context and executes from there. Context creation

**machine context module**

```

typedef struct mctx_st *mctx_t;
struct mctx_st {int eax, int ebx, int ecx, int edx, int esi, int edi, int ebp, int esp};
void loadcontext (mctx_t mctx);
void makecontext (mctx_t mctx, char *sp, void *lInk, void *func, void *arg);
void swapcontext (mctx_t old, mctx_t new);

```

**queue module**

```

typedef struct node_st *node_t;
struct node_st {node_t next};
void queue_insert (node_t *q, node_t t);
node_t queue_delete (node_t *q);

```

**memory module**

```

void *flist = NULL;
void *malloc (int size);
void free (void *ptr);

```

**threading module**

```

enum mth_state_t {READY, DEAD};
typedef struct mth_st *mth_t;
struct mth_st {mth_t next, mth_state_t state, mctx_st mctx};
mth_t mth_current, mth_rq;
mctx_t mctx_sched;
void mth_start (int stacksize, void *(*main)(void *), void *arg);
void mth_spawn (int stacksize, void *(*func)(void *), void *arg);
void mth_exit (void *value);
void mth_yield (void);
void mth_scheduler (void);

```

Figure 7.1: Module structure and pseudo-C specification of MTH



Figure 7.2: Threading model of MTH

`makecontext()` initializes a new context based on its arguments (location of new context, stack pointer, return link, address of target function, and argument for it); the stack pointer of the new context points to a stack frame prepared for the target function.

**Threading model.** The threading module provides interfaces for all the “visible” functionalities of MTH. The threading model is illustrated in Figure 7.2. When a new thread is *spawned*, it is put into a ready queue. A central scheduler removes a thread from the ready queue and makes it the *current thread* using context switching. The current thread may *exit* execution and become dead or *yield* execution and be put back into the ready queue.

A user program starts multi-threading by calling `mh_start()`, which will set up the threading data structures, spawn the first user thread, and call `mh_scheduler()` to initiate the scheduling. All these happen on the statically allocated system stack, which is different from the thread stacks to be allocated on the heap. The scheduler is essentially a loop implementing simple FIFO scheduling.

The thread creation function `mh_spawn()` takes three arguments describing the new thread: the stack size, the location of the thread code, and the thread arguments. It allocates a thread stack and a thread control block (TCB), creates a machine context for the

thread, and puts the TCB into the ready queue. Once made current (or running), a thread’s execution will not be interrupted unless it explicitly yields control using *mth\_yield()*, which transfers the control to the scheduler using context switching. When the main function of a thread returns, *mth\_exit* is invoked to mark the thread as dead and yield control to the scheduler. Then, the scheduler will reclaim the resources and terminate the dead thread.

Machine-level thread implementations like MTH pose many challenges for verification. The complex and subtle behaviors of the routines are difficult to model. The control flow is complex with the extensive use of embedded code pointers stored in machine contexts and TCBs. Furthermore, the invariants maintained by the threads are recursive by nature, because the threads have mutual expectations from each other.

**The memory module.** The memory module is self-contained, as described by the specification below. It makes use of a free list (*flist*) of memory blocks implemented using the queue data type. Its verification is adapted from that of a malloc/free library by Yu *et al.* [63]. We omit the details and instead only point out that there is a memory invariant  $I_{mem}$  (essentially stating the existence of the free list) that must be maintained during the execution of client programs of this module. Its use will be illustrated in later sections.

$$mblk_t t \triangleq \exists size. t - 4 \mapsto size * t + 4 \mapsto [size - 4] \wedge size \geq 4$$

$$I_{mem} \triangleq \exists tl. \text{node\_t\_ref}(mblk_t, tl, flist)$$

$$malloc : \text{Fn } size \{ \text{Aux :: Local : [28];}$$

$$\text{Pre} : I_{mem};$$

$$\text{Post} : I_{mem} * \text{retv} - 4 \mapsto size * \text{retv} \mapsto [size] \}$$

$$free : \text{Fn } ptr \{ \text{Aux :: Local : [8];}$$

$$\text{Pre} : I_{mem} * \text{ptr} - 4 \mapsto size * \text{ptr} \mapsto [size];$$

$$\text{Post} : I_{mem} \}$$



Figure 7.3: Machine context

## 7.2 Verification of the Machine Context Module

Machine context is a primitive concept for low-level programming. It offers the basis for higher-order control-flow transfers, such as call/cc and threading. As is the case of the queue and memory modules, the machine context module is specified and verified separately without any knowledge about its client—the threading module.

**What is a machine context?** Although the pseudo-C specification in Figure 7.1 seems to indicate that a context is merely eight words, the actual assumptions behind it are rather complex. As illustrated in Figure 7.3, the eight words represent a return value *retv*, six registers referred to collectively as *cs*, (the *cs* convention for context is different than that of *sp*, there are six of them, instead of the normal four), and a stack pointer *sp*. *sp* points to a return address *ret* typically found on top of a stack frame. (A machine context does not really care about other part of the stack, or if there is a stack at all). The six saved registers may point to some private data. (which could contain the other part of the stack). There may also be some environment data shared with external code. Eventually, a context is consumed when invoked. A proper invocation by jumping to the return address *ret* requires (1) the saved register contents be restored into the register file, (2) the stack and private data be preserved, and (3) the shared environment be available. In other words, it is only safe to load a context when its global shared data is available.

All these requirements make it difficult to specify and verify the seemingly simple

manipulations on machine contexts. To facilitate the reasoning, we define a macro for the context data structure  $mctx.t(a_{env}, mctx)$ , parametric to the environment described as  $a_{env}$ .

$$\begin{aligned} \exists retv, cs, sp, ret, a_{priv}. \quad & mctx \mapsto retv, cs, sp * sp \mapsto ret * a_{priv} \\ \wedge \text{cptr}(ret, \text{reg}_6(cs, sp+4)) \wedge \text{eax} = retv \wedge a_{env} * mctx \mapsto retv, cs, sp * sp \mapsto ret * a_{priv}) \end{aligned}$$

where  $a_{priv}$  describes the private data, and  $\text{reg}_6$  describes the 6 saved registers and  $\text{esp}$ :

$$\begin{aligned} \text{reg}_6(ebx, ecx, edx, esi, edi, ebp, esp) \triangleq & \text{ebx} = ebx \wedge \text{ecx} = ecx \wedge \text{edx} = edx \\ \wedge \text{esi} = esi \wedge \text{edi} = edi \wedge \text{ebp} = ebp \wedge \text{esp} = esp \end{aligned}$$

**Context switching.**  $\text{swapcontext}()$  expects two pointers (old and new contexts) as arguments and performs three tasks: to save registers to old context, to load registers from new context, and to transfer control to new context. Observations from inside and outside of  $\text{swapcontext}()$  are very different. From the code of  $\text{swapcontext}()$ , it gets called by one client and “returns” to another. However, this is transparent to the clients—when  $\text{swapcontext}()$  returns to a client, the stack and private data of the client are kept intact.

The specification and proof outline of  $\text{swapcontext}()$  is given in Figure 7.4. It uses a macro  $\text{Fn}_6$ , a variant of  $\text{Fn}$  obtained by replacing  $\text{reg}$  with  $\text{reg}_6$  and changing  $cs$  from referring to 4 registers to 6. Similarly to  $\text{Fn}$ ,  $\text{Fn}_6$  helps to manage the preservation of the stack and private data. In addition, the pre-condition of the specification dictates the shape of three pieces of memory: (1) the old context pointed to by  $old$ —at the beginning of the routine it is simply 32 bytes of memory available for use, (2) the shared data  $a_{env}$ , and (3) the new context pointed to by  $new$ . The new context is specified with help of the macro  $mctx.t$ . The environment parameter of this macro consists of two parts: the shared data  $a_{env}$  and another  $mctx.t$  macro describing the old context. This is because the old context will be properly set up by the routine before switching to the new one. A tricky point is that the old context will be expecting a new (existentially quantified) shared environment  $a_{newenv}$ . Although some may expect the new environment to be simply the old shared  $a_{env}$  together with the new context at  $new$ , this may not necessarily be the case. For instance, the new context may not be still alive when the old context regains control. The post-condition of

`swapcontext()` is relatively simple: the space for the old context would be available together with the new shared data  $a_{newenv}$ .

An interesting proof step is the one after the old context is packed but before the new one is unpacked. At this point, there is no direct notion of stack or function. The relevant machine state essentially comprises of two contexts and one environment:

$$\begin{aligned} \mathbf{eax} = \mathbf{new} \wedge & \quad \mathbf{mctx.t(a}_{newenv}, \mathbf{old}) * \mathbf{a}_{env} \\ & * \mathbf{mctx.t(mctx.t(a}_{newenv}, \mathbf{old}) * \mathbf{a}_{env}, new) } \end{aligned}$$

It is therefore safe to load the new context from `eax` and switch to it. More proof steps are available in Figure 7.4.

**Context loading.** `loadcontext()` essentially performs half of the tasks of `swapcontext()`, with some slight difference in the stack layout. Although we call it a “function”, it actually never returns and does not require the stack top to be a valid return address. Instead of using `Fn` macro, we only write the following as its complete pre-condition. We present the verification of `loadcontext()` in Figure 7.5.

**Context creation.** We present the specification and proof outline of `makecontext()` in Figure 7.6. The intermediate assertions are organized using the `Fn` macro. For conciseness, we omitted common parts of these macros, thus emphasizing only the changing parts: the stack frame `Local` and the current pre-condition `Pre`.

The pre-condition of the routine specifies (1) an empty context at `mctx`, (2) a stack `nsp` with available space of size `nss`, (3) some private data of the target context (potentially accessible from the argument `arg` of the function `func()` of the target context, (4) a link (return address) `lnk` to be used when the target context finishes execution, and (5) a function pointer `func()` for the code to be executed in the target context. It also specifies the exact requirements on the code at pointers `lnk` and `func`: (1) `aret` occurs both in the post-condition of `func()` and in the pre-condition of `lnk`, indicating that the code at the return address `lnk` may expect some results from the context function `func()`; (2) the stack should

```

Fn6 old,new { Aux: anewenv; Local: [0];
  Pre: old ↦ [32] * aenv * mctx_t(mctx_t(anewenv, old) * aenv, new);
  Post: old ↦ [32] * anewenv ∧ eax = 0 }

swapcontext: // void swapcontext(mctx_t old, mctx_t new);
  reg6(cs, sp) ∧ ss ≥ 0 ∧ stack(sp, ss, ret, old, new) * old ↦ [32] * aprv * aenv
  * mctx_t(mctx_t(anewenv, old) * aenv, new)
  ∧ cptr(ret, reg6(cs, sp + 4)) ∧ eax = 0 ∧ stack(sp + 4, ss + 4, old, new) * old ↦ [32] * aprv * anewenv)
    mov eax, [esp + 4] // load address of the context data structure we save in
    mov [eax + _eax], 0 // all registers preserved except eax
    mov [eax + _ebx], ebx
    mov [eax + _ecx], ecx
    mov [eax + _edx], edx
    mov [eax + _esi], esi
    mov [eax + _edi], edi
    mov [eax + _ebp], ebp
    mov [eax + _esp], esp
    mov eax, [esp + 8] // load address of the context data structure we have to load
    eax = new ∧ ss ≥ 0 ∧ stack(sp, ss, ret, old, new) * old ↦ 0, cs, sp * aprv * aenv
    * mctx_t(mctx_t(anewenv, old) * aenv, new)
    ∧ cptr(ret, reg6(cs, sp + 4)) ∧ eax = 0 ∧ stack(sp + 4, ss + 4, old, new) * old ↦ [32] * aprv * anewenv)
      // shuffle and cast
      eax = new ∧ old ↦ 0, cs, sp * sp ↦ ret * stack(sp, ss) * sp + 4 ↦ old, new * aprv
      ∧ cptr(ret, reg6(cs, sp + 4))
      ∧ eax = 0 ∧ anewenv * old ↦ 0, cs, sp * sp ↦ ret * stack(sp, ss) * sp + 4 ↦ old, new * aprv
      * aenv * mctx_t(mctx_t(anewenv, old) * aenv, new)
      // pack old context
      eax = new ∧ mctx_t(anewenv, old) * aenv * mctx_t(mctx_t(anewenv, old) * aenv, new)
      // unpack new context
      eax = new ∧ mctx_t(anewenv, old) * aenv * new ↦ retv', cs', sp' * sp' ↦ ret' * anewprv
      ∧ cptr(ret', reg6(cs', sp' + 4))
      ∧ eax = retv' ∧ mctx_t(anewenv, old) * aenv * new ↦ retv', cs', sp' * sp' ↦ ret' * anewprv)
      mov esp, [eax + _esp] // load the new stack pointer.
      mov ebp, [eax + _ebp]
      mov edi, [eax + _edi]
      mov esi, [eax + _esi]
      mov edx, [eax + _edx]
      mov ecx, [eax + _ecx]
      mov ebx, [eax + _ebx]
      mov eax, [eax + _eax]
      reg6(cs', sp') ∧ eax = retv' ∧ mctx_t(anewenv, old) * aenv * new ↦ retv', cs', sp' * sp' ↦ ret' * anewprv
      ∧ cptr(ret', reg6(cs', sp' + 4))
      ∧ eax = retv' ∧ mctx_t(anewenv, old) * aenv * new ↦ retv', cs', sp' * sp' ↦ ret' * anewprv)
      ret

```

Figure 7.4: Verification of machine context switching

```

loadcontext:           // void loadcontext (mctx_t mctx);
reg(cs,sp) ∧ stack(sp,ss,ret,mctx) * aenv * mctx_t(stack(sp,ss,ret,mctx) * aenv,mctx)
    mov eax, [esp+4]
eax = mctx ∧ stack(sp,ss,ret,mctx) * aenv * mctx_t(stack(sp,ss,ret,mctx) * aenv,mctx)
                                // unpack context
eax = mctx
                                ∧ anewenv * mctx_t(anewenv,mctx)
                                // unpack context
eax = mctx
                                ∧ anewenv * mctx ↦ retv', cs', sp' * sp' ↦ ret' * aprv
∧ cptr(ret', reg6(cs', sp'+4) ∧ eax = retv' ∧ anewenv * mctx ↦ retv', cs', sp' * sp' ↦ ret' * aprv)
    mov esp, [eax+_esp] // load the new stack pointer.
    mov ebp, [eax+_ebp]
    mov edi, [eax+_edi]
    mov esi, [eax+_esi]
    mov edx, [eax+_edx]
    mov ecx, [eax+_ecx]
    mov ebx, [eax+_ebx]
    mov eax, [eax+_eax]
    reg6(cs', sp'+4) ∧ eax = retv' ∧ anewenv * mctx ↦ retv', cs', sp' * sp' ↦ ret' * aprv
    ∧ cptr(ret', reg6(cs', sp'+4) ∧ eax = retv' ∧ anewenv * mctx ↦ retv', cs', sp' * sp' ↦ ret' * aprv)
    ret

```

Figure 7.5: Verification of machine context loading

```

Fn mctx, nsp, lnk, func, arg { Aux: aenv; Local: [0];
  Pre: mctx ↦ [32] * stack(nsp, nss) * aprv ∧ nss ≥ 12
    ∧ cptr(lnk, esp = nsp - 4 ∧ stack(nsp - 4, nss - 4, arg) * aret)
    ∧ cptr(func, Fn arg' { Local: nss - 8;
      Pre: sp' = nsp - 8 ∧ arg' = arg ∧ aenv * mctx ↦ [32] * aprv;
      Post: aret });
  Post: mctx_t(aenv, mctx) }

makecontext: // void makecontext (mctx_t mctx, char *sp, void *lnk, void *func, void *arg);
Local: [0]; Pre: mctx ↦ [32] * stack(nsp, nss) * aprv ∧ nss ≥ 12
  ∧ cptr(lnk, esp = nsp - 4 ∧ stack(nsp - 4, nss - 4, arg) * aret)
  ∧ cptr(func, Fn arg' { Local: nss - 8;
    Pre: sp' = nsp - 8 ∧ arg' = arg ∧ aenv * mctx ↦ [32] * aprv;
    Post: aret });

  mov eax, [esp+4] // load address of the context data structure.
  mov ecx, [esp+8] // load new stack top
  mov edx, [esp+20] // push the function's argument into new stack
  mov [ecx-4], edx
  mov edx, [esp+12] // push link (return address for the function) into new stack
  mov [ecx-8], edx
  mov edx, [esp+16] // push the function as return IP into new stack
  mov [ecx-12], edx
  sub ecx, 12
  mov [eax+esp], ecx // only the stack pointer matters for a fresh new context

Local: [0]; Pre: mctx ↦ [28], nsp - 12 * stack(nsp - 12, nss - 12, func, lnk, arg) * aprv
  ∧ cptr(lnk, esp = nsp - 4 ∧ stack(nsp - 4, nss - 4, arg) * aret)
  ∧ cptr(func, Fn arg' { Local: nss - 8;
    Pre: sp' = nsp - 8 ∧ arg' = arg ∧ aenv * mctx ↦ [32] * aprv;
    Post: aret });

  // unfold, shuffle, and cast

Local: [0]; Pre:
  mctx ↦ retv', cs', nsp - 12 * nsp - 12 ↦ func
  * stack(nsp - 12, nss - 12) * nsp - 8 ↦ lnk, arg * aprv
  ∧ cptr(lnk, esp = nsp - 4 ∧ stack(nsp - 4, nss - 4, arg) * aret)
  ∧ cptr(func, reg6(cs', nsp - 8) ∧ eax = retv' ∧ aenv * mctx ↦ retv', cs', nsp - 12 * nsp - 12 ↦ func
    * stack(nsp - 12, nss - 12) * nsp - 8 ↦ lnk, arg * aprv
    ∧ cptr(lnk, esp = nsp - 4 ∧ stack(nsp - 4, nss - 4, arg) * aret));
  // pack the fresh context

Local: [0]; Pre: mctx_t(aenv, mctx);
  ret

```

Figure 7.6: Verification of machine context creation



Figure 7.7: Thread control blocks and queues of MTH

be properly maintained upon returning to *lnk*.

The post-condition of the routine simply states that, when *makecontext()* returns, *mctx* will point to a proper context where the shared environment is *a<sub>env</sub>*.

Our interface of *makecontext()* is faithful to the Unix/Linux implementations. The heavy usage of embedded code pointers and function pointers in this case shows how crucial XCAP’s support of ECP is. As the proof shows, the most complex step is again on casting code pointers’ pre-conditions and pack all the resource into a complete context.

### 7.3 Verification of the Threading Module

With all the other modules verified, we are now ready to tackle the threading module.

**Thread control blocks and thread queues.** The main data structure for MTH threading is the thread control block (TCB). Figure 7.7 presents a queue of TCB’s. Every TCB comprises of three parts: (1) the first word is a link field for the queue, (2) the second word stores the status of the thread (READY or DEAD), and (3) a machine context is embedded in the next eight words. The TCB’s of all ready but non-running threads are put into a ready queue, implemented as an instance of the polymorphic queue introduced in Section 6.3.

$$\text{mth\_t\_ref}(st, a_{inv}, tl, q) \triangleq \text{node\_t\_ref}(\text{mth\_t}(st, a_{inv}), tl, q).$$

As a reminder, *node.t.ref* requires three parameters: a predicate describing the queue nodes, a list of node locations, and a queue pointer. *mth.t.ref* requires 4 parameters: (1)



Figure 7.8: Threading invariant of MTH

every thread in the queue has the same thread state  $st$ , (2) a certain threading invariant  $a_{inv}$  that must be established before any thread in the queue takes control, (3) a list  $tl$  of TCB handles in the queue, and (4) the queue pointer. Note that there should be a threading invariant expected by each thread's context when being switched to. Since this invariant is not yet known, we name it as  $a_{inv}$ . These arguments are organized properly to feed the `node.t.ref` macro. In particular, the content of a TCB (excluding the link field) must satisfy  $mth.t(st, a_{inv})$ , which is defined as:

$$\lambda mth. mth + 4 \mapsto st * mctx.t(l_{mem} * a_{inv}(mth), mth + 8)$$

In short, a TCB at  $mth$  satisfies the  $mth.t$  macro if the state  $st$  is located at offset 4 and a valid context is located at offset 8. Besides the threading invariant  $a_{inv}$  (to be provided by the current thread), the context also expects the invariant  $l_{mem}$  of the memory module.

**Threading invariant.** Figure 7.8 illustrates the run-time threading data structures and invariants. There are three entities involved: a scheduler context, a TCB of the running thread, and a ready queue, each pointed to by a global memory address. When a thread yields or exits, the scheduler context is invoked to do the scheduling.

A thread expects such an environment to be properly maintained before taking control. Meanwhile, itself is also part of such kind of environment for other threads. Therefore, the threading invariant is inherently recursive, as modelled in the following:

$$\begin{aligned}
\mathbf{I}_{\text{core}} &\triangleq \mu\alpha.\lambda st, cur. \exists sched, tl. \\
&\quad \mathbf{mctx\_sched} \mapsto sched * \mathbf{mctx\_t}(sched\_env(\alpha), sched) \\
&\quad * \mathbf{mth\_cur} \mapsto cur * cur \mapsto \_, st \\
&\quad * \mathbf{mth\_t.ref}(READY, \alpha(READY), tl, mth_rq)
\end{aligned}$$

The last three lines correspond to the three items in Figure 7.8, respectively. The recursive variable  $\alpha$  essentially represents  $\mathbf{I}_{\text{core}}$  itself, which is exactly the threading invariant that every TCB expects. Note that this invariant only describes the parts inside the dashed lines in Figure 7.8—the context of the current thread is not part of its own shared environment (nevertheless, each thread’s context considers all other threads’ contexts as part of its shared environment).

The environment of the scheduler’s context, described above using `sched_env`, deserves further explanation. We use the macro `sched_env(ainv)` to represent:

$$\begin{aligned}
\exists st, cur. \mathbf{I}_{\text{mem}} * \mathbf{mctx\_sched} \mapsto sched \\
&\quad * \mathbf{mth\_cur} \mapsto cur * cur \mapsto \_, st \\
&\quad * \mathbf{mth\_t.ref}(READY, a_{\text{inv}}(READY), tl, mth_rq) \\
&\quad * ((st = \text{READY} \wedge \mathbf{mctx\_t}(\mathbf{I}_{\text{mem}} * a_{\text{inv}}(\text{READY}, cur), cur + 8)) \vee \\
&\quad \quad (st = \text{DEAD} \wedge \exists size. cur - 4 \mapsto size * cur + 8 \mapsto [size - 8]))
\end{aligned}$$

The first three lines describe the existence of the scheduler pointer (not the scheduler context), the running thread, and the ready queue. When the scheduler takes control, there must have been a user thread that yielded. The last two lines describe the context of that yielding thread—based on whether the thread is still alive as indicated by the state field  $st$ , there would be either a proper context or the corresponding space available.

The  $\mathbf{I}_{\text{core}}$  macro is instantiated with the READY state and a running thread before used as the invariant of the threading module:

$$\mathbf{I}_{\text{full}}(cur) \triangleq \mathbf{I}_{\text{core}}(\text{READY}, cur) * cur + 8 \mapsto [32]$$

The  $cur$  pointer of the running thread is irrelevant to the clients of the threading module. The external invariant used when importing the threading module is simply

$$I_{mth} \triangleq I_{full}(\cdot)$$

where  $\cdot$  is a wildcard. We also define a syntactic sugar to describe the ready queue:

$$\text{ready\_queue}(tl, mth\_rq) \triangleq mth\_t\_ref(\text{READY}, I_{core}(\text{READY}), tl, mth\_rq).$$

**Thread yielding.** The most frequently used threading routine should be *mth\_yield()*, implemented as a context switch from the current (running) thread to the scheduler. Figure 7.9 gives its specification and proof outline. Similar to *swapcontext()*, Fn<sub>6</sub> is used here. After unpacking the threading invariant, pointers to the contexts of the current thread and the scheduler are pushed onto the stack. A call to *swapcontext()* completes the yielding. As indicated in the assertions, the thread's context expects both the memory and the threading invariants to be maintained before it takes control again. Interested readers could refer to Figure 7.4 to better understand the verification.

**Thread creation.** Dynamic thread creation is also a basic requirement for thread implementations. Just as *mth\_yield()* relies on *swapcontext()* to perform switching, *mth\_spawn()* creates a new thread with the help of *makecontext()*. When creating a new thread, one should specify the size (*stacksize*) of the thread stack, a pointer (*func*) to the thread code, and an argument pointer (*arg*) for the thread code. The data at *arg* is described by an existentially quantified predicate *a<sub>pre</sub>* that describes the private data prepared for *func()*. *mth\_spawn()* will call *malloc()* to allocate a big chunk of memory, divide it into the TCB and the stack of the new thread, initialize TCB, call *makecontext()* to build the new thread context, and call *queue\_insert()* to append the new thread onto the ready queue. The verification of *mth\_spawn()* is presented in Figure 7.10 and Figure 7.11.

The pre- and post-conditions of *mth\_spawn()* require that the memory and threading invariants be maintained, which is necessary for verifying the memory allocation and threading operations in the code. The pre- and post-conditions of *func()*, as asserted with *cptr*, also maintain the same invariants. In addition, *func()* is to be invoked with the given argument *arg*. The post-condition of *func()* involves the invariants only. This indicates

```

Fn6 { Aux: ; Local : [12];
  Pre: lmem * lmth;
  Post: lmem * lmth }

mth_yield: // void mth_yield (void);
Local: [12]; Pre: cur+8→[32]
  * lmem * mctx.sched→sched * mth.cur→cur * cur→_, READY * ready_queue(tl, mth.rq)
  * mctx.t(sched.env(lcore), sched)
    mov eax, [mctx.sched] // eax not preserved
    push eax
    mov eax, [mth.cur]
    add eax, 8
    push eax
Local: [4], cur+8, sched; Pre: cur+8→[32]
  * lmem * mctx.sched→sched * mth.cur→cur * cur→_, READY * ready_queue(tl, mth.rq)
  * mctx.t(sched.env(lcore), sched)
    // unfold, shuffle, cast
Local: [4], cur+8, sched; Pre: cur+8→[32]
  * lmem * mctx.sched→sched * mth.cur→cur * cur→_, READY * ready_queue(tl, mth.rq)
  * mctx.t(lmem * mctx.sched→sched * mth.cur→cur * cur→_, READY * ready_queue(tl, mth.rq)
    * mctx.t(lmem * lcore(READY, cur), cur+8), sched)
    call swapcontext // swapcontext(&mth.cur->mctx, mctx.sched);
Local: [4], cur+8, sched; Pre: cur+8→[32] * lmem * lcore(READY, cur)
  add esp, 8
Local: [12]; Pre: cur+8→[32] * lmem * lcore(READY, cur)
  // fold, shuffle, cast
Local: [12]; Pre: lmem * lmth
  ret

```

Figure 7.9: Verification of thread yielding

```

Fn stacksize, func, arg { Aux: ; Local: [56];
  Pre: lmem * lmth * apre ∧ stacksize ≥ 12 ∧ cptr(func, Fn arg' { Local: [stacksize-8];
    Pre: arg' = arg ∧ lmem * lmth * apre;
    Post: lmem * lmth });
  Post: lmem * lmth }

// cast (above is external interface, below is internal interface)

Fn stacksize, func, arg { Aux: tl; Local: [56];
  Pre: lmem * ready_queue(tl, mth_rq) * apre ∧ stacksize ≥ 12
    ∧ cptr(func, Fn arg' { Local: [stacksize-8];
    Pre: arg' = arg ∧ lmem * lmth * apre;
    Post: lmem * lmth });
  Post: lmem * ready_queue(tl@[retv], mth_rq) }

mth_spawn: // mth_t mth_spawn (int stacksize, void *(*func)(void *), void *arg);
Local: [56]; Pre: lmem * ready_queue(tl, mth_rq) * apre ∧ stacksize ≥ 12
  ∧ cptr(func, Fn arg' { Local: [stacksize-8];
    Pre: arg' = arg ∧ lmem * lmth * apre;
    Post: lmem * lmth });

  mov ecx, [esp+8]
  add ecx, size_of_mth_st
  push ecx // allocate a new thread control block

Local: [52], 40+stacksize; Pre: lmem * ready_queue(tl, mth_rq) * apre ∧ stacksize ≥ 12
  ∧ cptr(func, Fn arg' { Local: [stacksize-8];
    Pre: arg' = arg ∧ lmem * lmth * apre;
    Post: lmem * lmth });

  call malloc // mth_t t = (mth_t)malloc(sizeof(mth_st)+stacksize);

Local: [52], 40+stacksize; Pre: lmem * ready_queue(tl, mth_rq) * apre * t - 4 ↦ 40+stacksize
  * t ↦ [40+stacksize] ∧ eax = t ∧ stacksize ≥ 12
  ∧ cptr(func, Fn arg' { Local: [stacksize-8];
    Pre: arg' = arg ∧ lmem * lmth * apre;
    Post: lmem * lmth });

  pop ecx
  mov [eax+_state], READY // t->state = READY;
  mov edx, [esp+16]
  push edx
  mov edx, [esp+12]
  push edx
  push mth_exit
  add ecx, eax
  push ecx
  add eax, _mctx
  push eax

```

Figure 7.10: Verification of thread creation

```

Local: [36], t+8, t+40+stacksize, mth_exit, func, arg;
Pre: lmem * ready_queue(tl, mth_rq) * apre * t-4 ↪ 40+stacksize * t ↪ _, READY * t+8 ↪ [32]
    * stack(t+40+stacksize, stacksize) ∧ stacksize ≥ 12
    ∧ cptr(mth_exit, esp=sp) ∧ stack(sp, ss) * lmem * lfull(cur) * cur-4 ↪ size
        * sp ↪ [size - 40 - ss] ∧ ss=sp-curr-40)
    ∧ cptr(func, Fn arg' { Local : [stacksize-8];
        Pre: arg'=arg ∧ lmem * lmth * apre;
        Post: lmem * lmth });
call makecontext // makecontext(&t->mctx, t+sizeof(mth_st)+stacksize,
//                                         mth_exit, func, arg);

Local: [36], t+8, t+40+stacksize, mth_exit, func, arg;
Pre: lmem * ready_queue(tl, mth_rq) * t ↪ _, READY * mctx_t(lmem * lcore(READY, t), t+8)
    pop eax
    add esp, 16
    sub eax, _mctx
    push eax
    push mth_rq

Local: [48], mth_rq, t;
Pre: lmem * ready_queue(tl, mth_rq) ∧ t ↪ _ * mth.t(READY, lcore(READY), t)
    call queue_insert // queue_insert(&mth_rq, t);

Local: [48], mth_rq, t; Pre: lmem * ready_queue(tl@[t], mth_rq)
    add esp, 4
    pop eax

Local: [56]; Pre: lmem * ready_queue(tl@[t], mth_rq)
    ret

```

Figure 7.11: Verification of thread creation (continued)

that the data of  $a_{pre}$  must be properly deallocated before  $func()$  finishes, leaving no possibility of space leaks. Finally, the post-condition of  $meth\_spawn()$  does not explicitly refer to the newly spawned thread, which would be in the ready queue as part of  $l_{mth}$ .

**Thread termination.**  $meth\_exit$  is a routine that  $meth\_spawn()$  supplies as the return link for newly created contexts. It is invoked when a thread finishes execution and returns from its main function. The verification of  $meth\_exit$  is presented in Figure 7.12. The code simply sets the state of the current thread to be DEAD and transfers the control to the scheduler with  $loadcontext()$ . Since it is not a function, we can not use Fn syntax to present its pre-condition. The proof involves the common packing and unpacking of the thread-

```

mth_exit:
  esp = sp ∧ stack(sp, ss) * lmem * lfull(cur) * cur - 4 → size * sp → [size - 40 - ss]
  ∧ ss = sp - cur - 40 ∧ ss ≥ 8
  // unfold, shuffle

  esp = sp ∧ stack(sp, ss)
  * cur - 4 → size * cur + 8 → [32] * sp → [size - 40 - ss] ∧ ss = sp - cur - 40 ∧ ss ≥ 8
  * lmem * mctx_sched → sched * mth_cur → cur * cur → _, READY * ready_queue(tl, mth_rq)
  * mctx_t(sched_env(lcore), sched)
    mov eax, [mth_cur]
    mov [eax+_state], DEAD // mth_cur->state = DEAD;
    mov eax, [mctx_sched]
    push eax

  esp = sp - 4 ∧ stack(sp - 4, ss - 4, sched)
  * cur - 4 → size * cur + 8 → [32] * sp → [size - 40 - ss] ∧ ss = sp - cur - 40 ∧ ss ≥ 8
  * lmem * mctx_sched → sched * mth_cur → cur * cur → _, DEAD * ready_queue(tl, mth_rq)
  * mctx_t(sched_env(lcore), sched)
  // unfold, shuffle, cast

  esp = sp - 4 ∧ stack(sp - 4, ss - 4, sched)
  * cur - 4 → size * cur + 8 → [32] * sp → [size - 40 - ss] ∧ ss = sp - cur - 40 ∧ ss ≥ 8
  * lmem * mctx_sched → sched * mth_cur → cur * cur → _, DEAD * ready_queue(tl, mth_rq)
  * mctx_t(lmem * mctx_sched → sched * mth_cur → cur * cur → _, DEAD * ready_queue(tl, mth_rq)
    * cur - 4 → size * cur + 8 → [size - 8], sched)
  call loadcontext // loadcontext(mctx_sched);

```

Figure 7.12: Verification of thread termination

ing invariant. In addition, it combines the TCB and stack of the dead thread into one continuous memory block, which is to be freed by the scheduler.

**Threading initialization.** Multi-threading is started by calling *mth\_start()*. This function spawns the main thread, allocates the scheduler context, and establishes the global data structures used in the threading invariant of Figure 7.8. It starts thread scheduling by invoking *mth\_scheduler()*, which will return only after all threads are dead and deallocated. *mth\_start()* then deallocates the scheduler context and returns to the user program. We present the verification of *mth\_start()* in Figure 7.13.

```

Fn stacksize,main,arg { Aux: ; Local : [72];
  Pre: lmem * mctx_sched  $\mapsto$  _ * mth_cur  $\mapsto$  _ * mth_rq  $\mapsto$  _ * apre  $\wedge$  stacksize  $\geq$  12
         $\wedge$  cptr(main, Fn arg' { Local : [stacksize - 8];
  Pre: arg' = arg  $\wedge$  lmem * lmth * apre;
  Post: lmem * lmth });
  Post: lmem * mctx_sched  $\mapsto$  _ * mth_cur  $\mapsto$  _ * mth_rq  $\mapsto$  _ }

mth_start: // void mth_start (int stacksize, void *(*main)(void *), void *arg);
Local: [72]; Pre: lmem * mctx_sched  $\mapsto$  _ * mth_cur  $\mapsto$  _ * mth_rq  $\mapsto$  _ * apre  $\wedge$  stacksize  $\geq$  12
         $\wedge$  cptr(main, Fn arg' { Local : [stacksize - 8];
  Pre: arg' = arg  $\wedge$  lmem * lmth * apre;
  Post: lmem * lmth });

  mov [mth_rq], NULL // mth_rq = NULL;
Local: [72]; Pre: lmem * mctx_sched  $\mapsto$  _ * mth_cur  $\mapsto$  _ * ready_queue([],mth_rq) * apre
         $\wedge$  stacksize  $\geq$  12  $\wedge$  cptr(main, Fn arg' { Local : [stacksize - 8];
  Pre: arg' = arg  $\wedge$  lmem * lmth * apre;
  Post: lmem * lmth });

  mov eax, [esp+12] // initialize the ready thread queues
  push eax
  mov eax, [esp+8]
  push eax
  mov eax, [esp+4]
  push eax
  call mth_spawn // mth_spawn(stacksize, main, arg);
  add esp, 12 // spawn a thread for the main program

Local: [72]; Pre: lmem * mctx_sched  $\mapsto$  _ * mth_cur  $\mapsto$  _ * ready_queue([main],mth_rq)
  push size_of_mctx_st
  call malloc // mctx_sched = (mctx_t)malloc(sizeof(mctx_st));
  add esp, 4
  mov [mctx_sched], eax

Local: [72];
Pre: lmem * mctx_sched  $\mapsto$  sched * sched  $\mapsto$  [32] * mth_cur  $\mapsto$  _ * ready_queue([main],mth_rq)
  call mth_scheduler // mth_scheduler(); // do the threading

Local: [72];
Pre: lmem * mctx_sched  $\mapsto$  sched * sched  $\mapsto$  [32] * mth_cur  $\mapsto$  _ * ready_queue([],mth_rq)
  mov eax, [mctx_sched]
  push eax
  call free // free(mctx_sched);
  add esp, 4

Local: [72]; Pre: lmem * mctx_sched  $\mapsto$  sched * mth_cur  $\mapsto$  _ * ready_queue([],mth_rq)
  // cast

Local: [72]; Pre: lmem * mctx_sched  $\mapsto$  _ * mth_cur  $\mapsto$  _ * mth_rq  $\mapsto$  _
  ret

```

Figure 7.13: Verification of thread initialization

```

Fn { Aux: ; Local : [16];
  Pre: lmem*mctx_sched→sched*sched→[32]*mth_cur→_*ready_queue(tl,mth_rq);
  Post: lmem*mctx_sched→sched*sched→[32]*mth_cur→_*ready_queue([],mth_rq) }

mth_scheduler: // void mth_scheduler (void);

Local: [16];
Pre: lmem*mctx_sched→sched*sched→[32]*mth_cur→_*ready_queue(tl,mth_rq)
  push mth_rq // while (mth_cur = queue_delete(&mth_rq)) != NULL
  call queue_delete
  add esp, 4

Local: [16];
Pre: lmem*mctx_sched→sched*sched→[32]*mth_cur→_*ready_queue(tl',mth_rq)
  ∧ eax=retv ∧ ((retv=NULL ∧ tl=tl'=[]) ∨
  (retv≠NULL ∧ tl=retv::tl' ∧ retv→_*mth_t(READY,lcore(READY),retv)))
  cmp eax, NULL
  je sc_ret

Local: [16];
Pre: lmem*mctx_sched→sched*sched→[32]*mth_cur→_*ready_queue(tl',mth_rq)
  ∧ eax=retv ∧ retv≠NULL ∧ tl=retv::tl' ∧ retv→_*mth_t(READY,lcore(READY),retv)
  mov [mth_cur], eax // Found next ready thread
  add eax, _mctx
  push eax
  mov eax, [mctx_sched]
  push eax // ENTERING THREAD - by switching the machine context

Local: [8],sched,retv+8;
Pre: lmem*mctx_sched→sched*sched→[32]*mth_cur→retv*ready_queue(tl',mth_rq)
  ∧ eax=retv ∧ retv≠NULL ∧ tl=retv::tl' ∧ retv→_*mth_t(READY,lcore(READY),retv)
  // unfold, shuffle, cast, rename

Local: [8],sched,retv+8;
Pre: sched→[32]
  *lmem*mctx_sched→sched*mth_cur→cur*cur→_,READY*ready_queue(tl',mth_rq)
  *mctx_t(lmem*mctx_sched→sched*mth_cur→cur*cur→_,READY*ready_queue(tl',mth_rq))
  *mctx_t(sched_env(lcore),sched),cur+8)
  call swapcontext // swapcontext(mctx_sched, &mth_cur->mctx);

Local: [8],sched,retv+8;
Pre: sched→[32]
  *lmem*mctx_sched→sched*mth_cur→cur*cur→_,st*ready_queue(tl,mth_rq)
  *((st=READY ∧ mctx_t(lmem*lcore(READY,cur),cur+8)) ∨
  (st=DEAD ∧ ∃size.cur-4→size * cur+8→[size-8]))
  add esp, 8
  mov eax, [mth_cur] // If previous thread is now marked as dead, kick it out
  mov ecx, [eax+_state]
  cmp ecx, DEAD // if (mth_cur->state == DEAD)
  jne sc_els

```

Figure 7.14: Verification of thread scheduler

```

Local: [16];
Pre: sched  $\mapsto$  [32] * lmem * mctx_sched  $\mapsto$  sched * mth_cur  $\mapsto$  cur * ready_queue(tl, mth_rq)
       $\wedge$  st = DEAD  $\wedge$  cur  $\mapsto$  _, st * cur - 4  $\mapsto$  size * cur + 8  $\mapsto$  [size - 8]  $\wedge$  eax = cur
                           // cast

Local: [16];
Pre: lmem * mctx_sched  $\mapsto$  sched * sched  $\mapsto$  [32] * mth_cur  $\mapsto$  cur * ready_queue(tl, mth_rq)
       $\wedge$  cur - 4  $\mapsto$  size * cur  $\mapsto$  [size]  $\wedge$  eax = cur
      push eax
      call free           // free(mth_cur); // No stack from now on
      add esp, 4

Local: [16];
Pre: lmem * mctx_sched  $\mapsto$  sched * sched  $\mapsto$  [32] * mth_cur  $\mapsto$  cur * ready_queue(tl, mth_rq)
      jmp mth_scheduler

Local: [16];
Pre: sched  $\mapsto$  [32] * lmem * mctx_sched  $\mapsto$  sched * mth_cur  $\mapsto$  cur * ready_queue(tl, mth_rq)
       $\wedge$  st = READY  $\wedge$  cur  $\mapsto$  _, st * mctx_t(lmem * lcore(READY, cur), cur + 8)  $\wedge$  eax = cur
sc_els:                      // cast

Local: [16];
Pre: lmem * mctx_sched  $\mapsto$  sched * sched  $\mapsto$  [32] * mth_cur  $\mapsto$  cur * ready_queue(tl, mth_rq)
       $\wedge$  cur  $\mapsto$  _ * mth_t(READY, lcore(READY), cur)  $\wedge$  eax = cur
      push eax           // else
      push mth_rq         // insert last running thread back into this queue
      call queue_insert // queue_insert(&mth_rq, mth_cur);
      add esp, 8

Local: [16];
Pre: lmem * mctx_sched  $\mapsto$  sched * sched  $\mapsto$  [32] * mth_cur  $\mapsto$  cur * ready_queue(tl @ [cur], mth_rq)
      jmp mth_scheduler

Local: [16];
Pre: lmem * mctx_sched  $\mapsto$  sched * sched  $\mapsto$  [32] * mth_cur  $\mapsto$  _ * ready_queue(tl', mth_rq)
       $\wedge$  eax = retv  $\wedge$  retv = NULL  $\wedge$  tl = tl' = []
sc_ret:                      // cast

Local: [16];
Pre: lmem * mctx_sched  $\mapsto$  sched * sched  $\mapsto$  [32] * mth_cur  $\mapsto$  _ * ready_queue([], mth_rq)
      ret

```

Figure 7.15: Verification of thread scheduler (continued)

**Thread scheduler.**  $mh\_scheduler()$  is essentially a loop implementing FIFO scheduling. It is directly invoked only by  $mh\_start()$ . Upon entering  $mh\_scheduler()$ , the scheduler context and some other threading data structures are already in place. If the ready queue is not empty,  $mh\_scheduler()$  removes the first thread, sets it as the current thread, and switches to it using  $swapcontext()$ . The scheduler resumes execution when the current thread yields or exits, at which point the scheduler either puts the yielding thread into the ready queue or deallocates it based on the thread state. The scheduler finally returns when the ready queue becomes empty. We give the specification of  $mh\_scheduler()$  as follows, again omitting the details for space. The post-condition of  $mh\_scheduler$  preserves everything, except that there no thread in the ready queue.

We present the verification of  $mh\_scheduler()$  in Figure 7.14 and Figure 7.15. The scheduler context to be stored in the threading invariant is captured by  $swapcontext()$  when the scheduler calls it to switch to the new current thread.

## 7.4 Discussion

First, we want to emphasize that the threading module refers to other utility modules only through the interfaces. For example, the memory invariant  $I_{mem}$  is used in the threading specifications, but only propagated abstractly during the verification in the local-reasoning style of separation logic. It is therefore safe to upgrade the implementations of the utility modules without affecting the threading verification. Within the threading module, one may upgrade the FIFO scheduler without affecting other threading routines. In fact, one may even have  $mh\_start()$  be parameterized by a scheduler function with only minor changes in the proof structure.

The large and complex proof naturally raises practicality concerns. However, when being compared with other verification methods and judging practicalities, there are two aspects of our method that need to be taken into consideration.

For comparison with analysis- and test-based software verification methods in gen-

eral, the important question to ask is what kind of guarantee it can deliver. Many of these tools can automatically find bugs in millions of line of system code, but only in a “best-effort” fashion—both the programmer and the end user expect bugs to appear and patches to be released. In our case, we want to completely exclude certain categories of bugs, as guaranteed by the language-based approach used in this dissertation. So the first choice programmers should make is whether they just want to find bugs in applications or make rigorous claims and provide warranty about kernels.

For comparison with other language-based methods, such as traditional type systems, the important question to ask is what kind of target code is actually supported. Since there is no sound type systems for C and assembly, there have been many efforts on building the complete system kernels in higher-level type safe languages, all with trade-offs in efficiency and/or compatibility, some even with significant changes in programming style. In our case, we require no change to the existing system programming style and code base, thus expect no performance or compatibility issue when putting into real use. In general, we believe that low-level system code, mid-level infrastructure code, and high-level application code require different levels of safety guarantees. Thus their verifications will naturally result in different levels of productivity. In fact, the 6 person-month spent may be a fair price to pay, considering that the code is critical and heavily relied upon.

By targeting at a thread library, a piece of system code that is commonly recognized as complex and never fully verified with traditional approaches, we demonstrate the feasibility of using XCAP frameworks for the mechanical verification of realistic machine-level code. That being said, at the current stage, the abstraction level used in this dissertation is best suited for verifying critical small-scale software such as core system libraries. Much further study is needed to scale the basic framework for better productivity. One direction is to scale the verification from assembly level to C level, which should dramatically increase the productivity of verifications.

## Chapter 8

# The Coq Implementations

One key feature of the XCAP framework is mechanization. We mechanize not only our machine syntax, machine semantics, assertion languages, interpretations, inference rules, and program specifications and proofs, but also the complete meta theory of XCAP, in a general mathematic logic. Furthermore, we want to directly have the power of higher-order predicate logic, through a shallow embedding. For these reasons, our usage of the Coq proof assistant [58] is to treat it as both a mechanized logic framework and a mechanized meta logic framework, as explained in Section 2.2.

In this chapter, we first discuss our Coq implementations for the meta theory of *PropX* interpretation, for both predicative and impredicative versions. Then we discuss the implementation of XCAP inference rules and meta theory. Based on that, we present the difference with the implementation of XCAP86. Finally, we discuss our implementation for the certified mini thread library in the previous chapter.

We point out that although our implementation and presentation in this dissertation use CiC/Coq as the mechanized meta logic, in theory it is possible to implement XCAP upon other mechanized meta logics.

## 8.1 Implementation of $PropX$

For the extended propositions in Section 3.2, we define it as the following.

```
Inductive PropX : Type
  := cptr: Word -> (State -> PropX) -> PropX
  | prop: Prop -> PropX
  | andx: PropX -> PropX -> PropX
  | orx : PropX -> PropX -> PropX
  | impx: PropX -> PropX -> PropX
  | allx: forall A, (A -> PropX) -> PropX
  | extx: forall A, (A -> PropX) -> PropX.
```

The encoding uses higher-order abstract syntax (HOAS) [52] to represent extended predicates. Interpretation of extended propositions is defined as a recursive function.

```
Definition Assertion := State -> PropX.

Definition CdHpSpec := Map Label Assertion.

Fixpoint Itp (P : PropX) (Si : CdHpSpec) {struct P} : Prop
  := match P with
    | cptr l a => lookup Si l a
    | prop p => p
    | andx P Q => Itp P Si /\ Itp Q Si
    | orx P Q => Itp P Si /\ Itp Q Si
    | impx P Q => Itp P Si -> Itp Q Si
    | allx A P => forall x, Itp (P x) Si
    | extx A P => exists x, Itp (P x) Si
  end.
```

And the interpretation of assertions is a simple lift.

```
Definition ItP a Si S := Itp (a S) Si.
```

For the XCAP with impredicative polymorphism defined in Section 4.1 and Section 4.3 the HOAS encoding of extended propositions no longer works. The positivity requirement in Coq inductive definition limits the type  $A$  of the quantified terms to be of lower level than  $PropX$ , which can not be used for impredicative quantifications. We use de Bruijn notations [15] to encode them, but keep using HOAS for all other constructors.

```

Inductive PropX : list Type -> Type :=
| var : forall L A, A
| lift: forall L A, PropX L
| cptr: forall L, Word -> (State -> PropX L)
| ref : forall L, Word -> (Word -> PropX L)
| prop: forall L, Prop
| andx: forall L, PropX L -> PropX L
| orx : forall L, PropX L -> PropX L
| impx: forall L, PropX L -> PropX L
| allx: forall L A, (A -> PropX L)
| extx: forall L A, (A -> PropX L)
| allv: forall L A, PropX (L ++ A :: nil)
| extv: forall L A, PropX (L ++ A :: nil)
| mu : forall L A, (A -> PropX (L ++ A :: nil)) -> A -> PropX L.

```

We can also define some syntactic sugars for *PropX*.

```

Notation "<< p >>" := (prop _ p ) (at level 40).
Notation "P ./\ Q" := (andx _ P Q) (at level 51, right associativity).
Notation "P .\ / Q" := (orx _ P Q) (at level 70, right associativity).
Notation "P .-> Q" := (impx _ P Q) (at level 100, right associativity).

```

The following auxiliary definitions are straight-forward.

```

Definition Assertion := State -> PropX nil.

Definition WordTy     := Word -> PropX nil.

Definition CdHpSpec := Map Label Assertion.

Definition DtHpSpec := Map Label WordTy.

Definition Env        := list (PropX nil).

```

When weak update reference is supported, the interpretation of XCAP assertions can be defined as the following.

```

Definition Itp P Si Fi := OK nil Si Fi P.

Definition DH Si Fi H :=
  forall l t, lookup Fi l t -> exists w, lookup H l w /\ Itp (t w) Si Fi.

Definition ItP a Si S :=
  match S with (H, R) =>
  exists Fi, exists H1, exists H2,
  disjoint H1 H2 /\ merge H1 H2 = H /\ Itp (a (H1,R)) Si Fi /\ DH Si Fi H2
end.

Notation "p ==> q"      := (forall Si S, ItP p Si S -> ItP q Si S)
                                (at level 130, right associativity).

```

*Itp* is the interpretation of extended propositions, defined using the *PropX* validity below. *ItP* is the assertion interpretation, as discussed in Section 5.1.

```

Inductive OK : Env -> CdHpSpec -> DtHpSpec -> PropX nil -> Prop :=
| ok_env      : forall E Si Fi p,   In p E           -> OK E Si Fi p
| ok_cptr_i   : forall E Si Fi l P,  lookup Si l P   -> OK E Si Fi (cptr nil l P)
| ok_cptr_e   : forall E Si Fi l P q,
              OK E Si Fi (cptr nil l P) ->
              (lookup Si l P -> OK E Si Fi q)  -> OK E Si Fi q
| ok_ref_i    : forall E Si Fi l P,  lookup Fi l P   -> OK E Si Fi (ref nil l P)
| ok_ref_e    : forall E Si Fi l P q,
              OK E Si Fi (ref nil l P) ->
              (lookup Fi l P -> OK E Si Fi q)  -> OK E Si Fi q
| ok_prop_i   : forall E Si Fi (p : Prop), p         -> OK E Si Fi (<< p >>)
| ok_prop_e   : forall E Si Fi (p : Prop) q,
              OK E Si Fi (<< p >>) ->
              (p -> OK E Si Fi q)           -> OK E Si Fi q
| ok_and_i    : forall E Si Fi p q, OK E Si Fi p ->
              OK E Si Fi q           -> OK E Si Fi (p ./\ q)
| ok_and_e1   : forall E Si Fi p q,
              OK E Si Fi (p ./\ q)        -> OK E Si Fi p
| ok_and_e2   : forall E Si Fi p q,
              OK E Si Fi (p ./\ q)        -> OK E Si Fi q
| ok_or_i1   : forall E Si Fi p q, OK E Si Fi p ->
              OK E Si Fi q           -> OK E Si Fi (p .\vee q)
| ok_or_i2   : forall E Si Fi p q, OK E Si Fi q ->
              OK E Si Fi q           -> OK E Si Fi (p .\vee q)
| ok_or_e    : forall E Si Fi p q r,
              OK E Si Fi (p .\vee q) ->
              OK (cons p E) Si Fi r ->
              OK (cons q E) Si Fi r   -> OK E Si Fi r
| ok_imp_i    : forall E Si Fi p q,
              OK (cons p E) Si Fi q   -> OK E Si Fi (p .-> q)
| ok_imp_e    : forall E Si Fi p q,
              OK E Si Fi p ->
              OK E Si Fi (p .-> q)   -> OK E Si Fi q
| ok_allx_i   : forall E Si Fi A
              (P : A -> PropX nil),
              (forall x, OK E Si Fi (P x)) -> OK E Si Fi (allx nil A P)
| ok_allx_e   : forall E Si Fi A P,
              OK E Si Fi (allx nil A P) ->
              forall B : A,                  OK E Si Fi (P B)
| ok_extx_i   : forall E Si Fi A
              (P : A -> PropX nil) x,
              OK E Si Fi (P x)         -> OK E Si Fi (extx nil A P)
| ok_extx_e   : forall E Si Fi A P q,
              OK E Si Fi (extx nil A P) ->
              (forall B : A,
              OK (cons (P B) E) Si Fi q) -> OK E Si Fi q
| ok_allv_i   : forall E Si Fi A p,
              (forall x : A -> PropX nil,
              OK E Si Fi (Subst nil A p x)) -> OK E Si Fi (allv nil A p)
| ok_extv_i   : forall E Si Fi A p
              (q: A -> PropX nil),
              OK E Si Fi (Subst nil A p q) -> OK E Si Fi (extv nil A p)
| ok_mu_i     : forall E Si Fi A p (v:A),
              OK E Si Fi (Subst nil A (p v)
              (mu nil A p)) -> OK E Si Fi (mu nil A p v).

```

Soundness of *PropX* validity rules is implemented as the following lemma. It takes around 4,000 lines of Coq tactics to prove this theorem.

```

Lemma Validity_Soundness :
  forall L Si Fi (p : PropX L),
  match L as t return forall p : PropX t, Prop with
  nil => fun p => OK nil Si Fi p ->
    match p in PropX t return t = nil -> Prop with
    | var _ _ _ => fun pf => False
    | lift _ _ _ => fun pf => False
    | cptr _ 1 P => fun pf => lookup Si 1 (fun S => eq_rect _ _ (P S) _ pf)
    | ref _ 1 P => fun pf => lookup Fi 1 (fun S => eq_rect _ _ (P S) _ pf)
    | prop _ p => fun pf => p
    | andx _ p q => fun pf => OK nil Si Fi (eq_rect _ _ p _ pf) /\ 
                      OK nil Si Fi (eq_rect _ _ q _ pf)
    | orx _ p q => fun pf => OK nil Si Fi (eq_rect _ _ p _ pf) \/
                      OK nil Si Fi (eq_rect _ _ q _ pf)
    | impx _ p q => fun pf => OK nil Si Fi (eq_rect _ _ p _ pf) ->
                      OK nil Si Fi (eq_rect _ _ q _ pf)
    | allx _ A P => fun pf => forall x, OK nil Si Fi (eq_rect _ _ (P x) _ pf)
    | extx _ A P => fun pf => exists x, OK nil Si Fi (eq_rect _ _ (P x) _ pf)
    | allv _ A p => fun pf => forall x, OK nil Si Fi
      (Subst _ _ (eq_rect _ _ p _ (trans_eq (eq_app_eq _ _ _ pf)
                                         (sym_eq (nil_app_eq _)))))

      x)
    | extv _ A p => fun pf => exists x, OK nil Si Fi
      (Subst _ _ (eq_rect _ _ p _ (trans_eq (eq_app_eq _ _ _ pf)
                                         (sym_eq (nil_app_eq _)))))

      x)
    | mu _ _ P v => fun pf => OK nil Si Fi
      (Subst _ _ (eq_rect _ _ (P v) _ (trans_eq (eq_app_eq _ _ _ pf)
                                         (sym_eq (nil_app_eq _)))))

      (mu nil _ (fun v => (eq_rect _ _ (P v)
                               - (eq_app_eq _ _ _ pf)))))

      end (refl_equal _)
    | _ => fun _ => True
  end p.

```

To allow more *Prop*-like handling of proofs, we build many tactics for *PropX* reasoning. Below are a selection of them.

```

Definition okvalid := Validity_Soundness nil.

Ltac des H      :=  

  generalize (okvalid _ _ H); clear H; intro; unfold eq_rect in * |-.  

  

Ltac dest H      :=  

  generalize (okvalid _ _ H); clear H; destruct 1; unfold eq_rect in * |-.  

  

Ltac destx H v :=  

  generalize (okvalid _ _ H); clear H; destruct 1 as [v]; unfold eq_rect in * |-.  

  

Ltac splitx      := apply ok_and_i.  

  

Ltac leftx       := apply ok_or_i1.  

  

Ltac rightx      := apply ok_or_i2.  

  

Ltac introx      := apply ok_allx_i.  

  

Ltac existsx x   := apply ok_extx_i with x.  

  

Ltac introv      := apply ok_allv_i.  

  

Ltac existsv a   := apply ok_extv_i with a.  

  

Ltac propx       := apply ok_prop_i.  

  

Ltac cptrx       := apply ok_cptr_i.

```

## 8.2 Implementation of XCAP

We implemented XCAP inference rules as the following inductive definitions.

```

Inductive WFiseq : CdHpSpec -> Assertion -> InstrSeq -> Prop :=
| wfiseq : forall Si a c I a',
  (forall Si s, ItP a Si s ->
   exists s',
   Next c s s' /\ ItP a' Si s') ->
   WFiseq Si a' I                                     -> WFiseq Si a (iseq c I)
| wfbgti : forall Si a rs w l I a' a'',
  ((fun s => << _R s rs <= w >> ./\ a s) ==> a'') ->
  ((fun s => << _R s rs > w >> ./\ a s) ==> a' ) ->
  lookup Si l a' ->
  WFiseq Si a'' I                                     -> WFiseq Si a (bgti rs w l I)
| wfjfd   : forall Si a l a',
  lookup Si l a' ->
  (a ==> a')                                         -> WFiseq Si a (jd l)
| wfjmp   : forall Si a r,
  (a ==> fun S => extv _ _
   (eq_rect _ _ (cptr _ (_R S r) (var _ _)
                      ./\ var _ _ S)
              _ (nil_app_eq _)))                         -> WFiseq Si a (jmp r)
| wfecp   : forall Si a l a' a'' I,
  WFiseq Si a' I ->
  lookup Si l a'' ->
  ((fun s => cptr _ l a' ./\ a s) ==> a')          -> WFiseq Si a I.

```

```

Inductive WFcode : CdHpSpec -> CodeHeap -> CdHpSpec -> Prop :=
| wfcdhp : forall Si Si' C,
  (forall l a, lookup Si' l a ->
   exists I, lookup C l I /\ WFiseq Si a I)      -> WFcode Si C Si'
| wflink : forall Si1 Si2 Si'1 Si'2 C1 C2,
  WFcode Si1 C1 Si'1 ->
  WFcode Si2 C2 Si'2 ->
  Map.disjoint C1 C2 ->
  subseteq Si1 (merge Si1 Si2) ->
  subseteq Si2 (merge Si1 Si2)
  -> WFcode (merge Si1 Si2) (merge C1 C2) (merge Si'1 Si'2).

```

```
Definition WFprogram Si a P := match P with (c, (s, i)) =>
  WFcode Si c Si /\ ItP a Si s /\ WFiseq Si a i
  end.
```

Selected soundness theorem and lemmas of XCAP are presented as follows. It takes around 700 Coq tactics to prove these theorem. This confirms that XCAP is a lightweight framework.

```

Lemma InstrSeqWeakening :
  forall Si Si' a a' I,
    WFiseq Si a' I -> subsequeq Si Si' -> (a ==> a') -> WFiseq Si' a I.

Lemma CodeHeapTyping :
  forall C Si l a,
    WFcode Si C Si -> lookup Si l a -> exists I, lookup C l I /\ WFiseq Si a I.

Theorem Soundness :
  forall Si a P, WFprogram Si a P ->
  exists a', exists P', STEP P P' /\ WFprogram Si a' P'.

```

### 8.3 Implementation of XCAP86

XCAP86's machine model, Mini86, is quite different from TM used in XCAP. Its Coq implementation is complicated by new details such as byte-addressed word-aligned memory, fixed machine word size, and machine instruction decoding. For example, the following are the macros to test if a byte or word is stored in the memory. (To keep our machine model close to TM, the memory is modeled as a mapping from aligned addresses to machine words.)

```

Definition byte h l :=
  match (Z_eq_dec (l mod 4) 0) with
  | left _ => match (h l) with
    | Some w => Some (w mod 256)
    | None => None
    end
  | _       => match (Z_eq_dec (l mod 4) 1) with
    | left _ => match (h (l-1)) with
      | Some w => Some ((w / 256) mod 256)
      | None => None
      end
    | _       => match (Z_eq_dec (l mod 4) 2) with
      | left _ => match (h (l-2)) with
        | Some w => Some ((w / 65536) mod 256)
        | None => None
        end
      | _       => match (h (l-3)) with
        | Some w => Some ((w / 65536) / 256)
        | None => None
        end
      end
    end
  end
end.

```

```

Definition dword h l w := byte h l      = Some (w mod 256)
          /\ byte h (l+1) = Some (w / 256 mod 256)
          /\ byte h (l+2) = Some (w / 65536 mod 256)
          /\ byte h (l+3) = Some (w / 65536 / 256).

```

We carefully designed Mini86 and its implementation so that they stay as close to TM as possible. For example, below are selected implementations details of Mini86, which look similar to TM's in Section 2.2.

```

Inductive Next : Instr -> State -> State -> Prop :=
| stp_add  : forall r o H R F R' F',
  Dword (R r + Ro R o) ->
  uR R R' r (R r + Ro R o) ->
  CalcF F' (R r + Ro R o) ->
  Next (add r o) (H, R, F) (H, R', F')
| stp_mov   : forall r o H R F R',
  uR R R' r (Ro R o) ->
  Next (mov r o) (H, R, F) (H, R', F)
| stp_movrm : forall r d H R F w R',
  lookup H (Ra R d) w ->
  uR R R' r w ->
  Next (movrm r d) (H, R, F) (H, R', F)
| .....

Inductive STEP : Program -> Program -> Prop :=
| stp_iseq : forall H R F H' R' F' pc c npc,
  Dc H pc c npc ->
  Next c (H, R, F) (H', R', F') ->
  STEP ((H, R, F), pc) ((H', R', F'), npc)
| stp_jmp  : forall o H R F pc npc,
  Dc H pc (jmp o) npc ->
  STEP ((H, R, F), pc) ((H, R, F), Ro R o)
| .....

```

Other than the difference in machine model, XCAP86 also mainly differs in instruction-level inference rules.

```

Inductive WFiseq : CdHpSpec -> Assertion -> InstrSeq -> Prop :=
| wfiseq : forall Si a c I a',
  (a ==> (fun s => Ex s' .
    <<Next c s s'>> ./\ a' s')) ->
  WFiseq Si a' I
  -> WFiseq Si a (iseq c I)
| wfjcc : forall Si a cc f I a' a'',
  ((fun s => <<~(Fcc (_F s) cc)>> ./\ a s) ==> a'') ->
  ((fun s => <<Fcc (_F s) cc>> ./\ a s) ==> a') ->
  lookup Si f a' ->
  WFiseq Si a' I
  -> WFiseq Si a (iseq (jcc cc f) I)
| wfjmpi : forall Si a f a',
  lookup Si f a' ->
  (a ==> a')
  -> WFiseq Si a (instr (jmp (word f)))
| wfjmpi : forall Si a r,
  (a ==> fun S => extv _ _
    (eq_rect _ _ (cptr _ (_R S r) (var _ _)
      ./\ var _ _ S)
    _ (nil_app_eq _)))
  -> WFiseq Si a (instr (jmp (reg r)))
| wfcallli : forall Si a f fret a',
  lookup Si f a' ->
  (a ==> (fun s => Ex s' .
    <<Next (push (word fret)) s s'>>
    ./\ a' s')) ->
  WFiseq Si a (instr' (call (word f)) fret)
| wfcalllr : forall Si a r fret,
  (a ==> (fun S => extv _ _
    (eq_rect _ _ (cptr _ (_R S r) (var _ _)
      ./\ Ex s'. <<Next (push (word fret)) S s'>>
      ./\ var _ _ s')
    _ (nil_app_eq _)))
  -> WFiseq Si a (instr' (call (reg r)) fret)
| wfret : forall Si a,
  (a ==> (fun S => Ex fret.
    <<lookup (_H S) (_R S esp) fret>>
    ./\ extv _ _
    (eq_rect _ _ (cptr _ fret (var _ _)
      ./\ Ex s'. <<Next pop' S s'>>
      ./\ var _ _ s')
    _ (nil_app_eq _)))
  -> WFiseq Si a (instr ret)
| wfcp : forall Si a f a' a'' I,
  WFiseq Si a' I ->
  lookup Si f a'' ->
  ((fun s => cptr _ f a' ./\ a s) ==> a')
  -> WFiseq Si a I.

```

The implementation of Mini86 takes about 400 lines of Coq code. The implementation of XCAP86 meta theory takes about 1,000 lines of Coq code.

## 8.4 Implementation of MTH

Every MTH module is specified and verified separately from its clients. During the certification of MTH, we collected a large number of lemmas, including some on generic separation-logic reasoning and others on common code patterns. Our Coq implementation consists of approximately 34,000 lines of Coq code (about 5,000 lines are common lemmas, 3,000 lines are for the queue module, 8,000 lines are for the machine context module, and 18,000 lines are for the threading module). The full compilation of XCAP86 and MTH implementation in Coq (proof-script checking and proof-binary generation) takes about 4 hours on an Intel Pentium M 1.6Ghz CPU with 2GB memory.

The development of the code took nearly six person-month. There are several reasons for the large proof size. First of all, this is the first time we are doing this kind of realistic proof, so a lot of infrastructural code and experience need to be developed and learned. For example, the first procedure we certified, *swapcontext()*, took one person-month and 5,000 lines of Coq code, even though it consists of only 19 lines of assembly code. Second, the relatively low level of proof reuse caused much redundancy. The third reason is the complexity of x86 machine, where features such as finite integer and byte-addressed word-aligned memory are not support very well in Coq yet. Nevertheless, the biggest reason, we believe, is the complexity of the actual machine code, which is obvious based on the discussions in the previous chapter.

For example, the machine context data type in Section 7.2 is implemented as the following, which is more complex than what its meta presentation looks like.

```

Definition mctx_t L aenv mctx : Heap -> PropX L :=
fun H =>
Ex retv, bx, cx, dx, si, di, bp, sp, ret.
  star (ptolist _ mctx (retv::bx::cx::dx::si::di::bp::sp::nil))
  (star (pto _ sp ret)
    (fun H => extv _ _ (eq_rect _ _
      (Lift _ _ (var t0 _ H))
      ./\ codeptr _ ret
      (fun S' => match S' with ((H',R'),F') in
        reg6 _ bx cx dx si di bp (sp+4) R' ./\ << R' eax = retv>>
        ./\ star (fun H => Shift _ _
          (eq_rect _ _ (aenv H) _ (app_nil_eq _)) _)
        (star (ptolist _ mctx (retv::bx::cx::dx::si::di::bp::sp::nil))
        (star (pto _ sp ret)
          (fun H => Lift _ _ (var t0 _ H))))
        H'
        end))
      _ (nil_app_eq _))))
  H.

```

On the other hand, since we have used a lot of abstraction and composition in the verification and implementation, we are still able to achieve very abstract and modular specifications and reasoning, such as the thread yielding function interface shown below.

```

Definition Amth_yield : Assertion := Fn6 {Aux : ; Local : [12] ;
  Pre: star (Imem _) (Imth _);
  Post: star (Imem _) (Imth _)}.

```

# Chapter 9

## Conclusion and Future Work

### 9.1 Conclusion

The formal establishment of low-level system code safety poses great challenges to existing language-based security methods. In particular, generating proof-carrying code for realistic machine binary requires a verification framework that is both expressive and modular. Previously, neither traditional type systems nor Hoare-logic systems can achieve both of these goals without comprise. Type systems in general lacks support of general logic predicate, while ordinary logic assertions can not modularly talk about higher-order programming concepts such as embedded code pointers.

In this dissertation, we propose a hybrid approach to achieve modular and expressive machine code verification. By combining the general logic and semantic subsumptions with syntactic type-like constructs and inference rules, our new framework, XCAP, can be used to write program specification in arbitrary logical predicates, in which higher-order features such as embedded code pointers, impredicative polymorphisms, recursive specifications, and weak update references can be expressed as primitive propositions. Thus, XCAP achieves the expressive power of logic-based approaches and the modularity of type-based approaches.

XCAP is not an isolated result. It can be used as a target of existing certifying com-

piler, since there is a straight-forward type-preserving translation from a typed assembly language to XCAP. The interaction of TAL and XCAP code is very flexible, as the translation is essentially a shallow embedding of TAL in XCAP’s assertion language, *PropX*. For application and system code written in type-safe languages, XCAP is a good platform for these code to interoperate with each other, as well as other certified system kernel module.

The most important goal for XCAP is to support direct verification of realistic system kernel assembly code with the help of an interactive proof assistant. XCAP is ported to XCAP86, with a realistic machine model following the x86 architecture. We show how to use XCAP86 to certify a mini thread library, which could serve as a basis for realistic certified system kernel. Such kind of code is not certifiable with traditional type safe languages, thus our verification is the first of this kind.

One key aspect of the XCAP framework is mechanization. Not only are the machine model, specification languages, and proof of assembly programs fully mechanized in the Coq proof assistant, but the entire meta theory of XCAP and its various extensions are fully implemented in Coq as well. In the end, the only things that need to be trusted by the programmers and users are merely the machine model, mathematical logic, and a tiny proof-checker.

In summary, XCAP is a simple, general, expressive, and modular framework for verification of realistic machine code found in both application and system programs.

## 9.2 Trusted Computing Base

Given the mixed presentation of the theoretical framework and mechanized implementation in the previous chapters, it may not be obvious what exactly are trusted and untrusted for XCAP and its applications. In this section we discuss the trusted computing base (TCB) from the point of view of a programmer.

**Trusted: meta logic.** To do any formal reasoning, the programmer has to choose a (mechanized) meta logic, and trust it with respect to his intuition, *i.e.*, agree with the representations and reasoning in it. In our case, the variant of Calculus of Inductive Constructions (CiC), *a.k.a.*, higher-order predicate logic with inductive definitions, is the mechanized meta logic theory that needs to be trusted. The programmer should treat this logic as a consistent one. Yet its consistency is not provable inside itself. In practice, it is usually not a big concern, as the meta theory of the logic has been published and can be examined by any logicians.

**Trusted: machine model.** First of all, the programmer has to trust the mathematical modeling of the actual hardware, namely, the CPU and the memory. The machine representation on paper and encoding in Coq are merely symbols, while the actual computation is a physical process. Without including formal hardware verification results, it is impossible to prove the correspondence between the symbols and the hardware state. In practice, it is usually not a concern though, as the hardware specifications are publically available and considered correct in general.

**Trusted: proof checker.** In terms of mechanization, the programmer needs to have a proof checker to mechanically check the validity of proofs with respect to specifications. The proof checker has to be trusted unless it is verified again using some other formal method (which again will have the question of TCB). Note that the proof checker is not equal to the entire Coq proof assistant; only a small part of Coq does the job of proof checking. The majority of Coq source code, such as those dealing with proof searching, do not need to be trusted. This is similar to the “certifying compiler” concept. Although we do not have a stand-alone proof checker for CiC, building one should not take more than a few thousand lines of code. One easy way to raise assurance of the proof checker is to independently develop multiple checkers. This way, even if each checker has bugs that occur 20% of the time, the chance of a faulty proof passing three checkers is less than 1%.

**Untrusted: PropX and XCAP theory.** Since the entire PropX and XCAP meta theory have been mechanized and proved consistent inside Coq, they do not need to be trusted at all. As shown by the various expansions done in this dissertation, as long as the expanded framework is proved sound in Coq, these extensions are safe.

**Trusted: program specifications and safety polices.** Whether the program specifications and safety polices are trusted or not is perhaps the most confusing question. The programmer has to realize that, in the end, he has to trust that the program specifications do reflect what is expected of the program's behavior. However, for traditional type systems, one often has to ask “what kind of properties are actually supported”. This is because there is a gap between program specifications—types—and the meta logical safety polices. In other words, what a programmer sees (types) is not exactly what he gets (safety policy). He has to trust two things: (1) the program specification reflects the safety policy; and (2) the safety policy is what he wants.

For XCAP, however, we support WYSWYG (What You See is What You Get), as the program specifications are written in logic formulas. The code heap specification contains the safety polices at each key program point. (Strictly speaking, XCAP specifications are not meta logic formulas, as they may contain PropX syntactic constructs such as `cptr`; however, the soundness theorem of PropX interpretation guarantees that PropX level constructors and quantifiers can be treated as meta logic ones.) Thus it is meaningless to ask “what kind of properties are actually supported”, as all possible state-based safety policies definable in CiC are supported in XCAP. When we refer to an XCAP program specification as “trusted”, we only mean that we believe the meta logic safety policy (same as the program specification) is what we really want.

**Untrusted: safety proof.** Once the program specification is set, the safety proof can be constructed through many different ways, such as theorem provers, interactive proof assistants, and certifying compilers. None of these details matters, though, as the pro-

grammer does not need to trust these proof—either they are correct and accepted by the checker, or they are incorrect and rejected by the checker.

Based on the above discussions, it is clear that the trusted computing base for XCAP is really small. In the author’s opinion, the only further reducible part of the TCB is the proof checker, which can always be replaced by smaller and smarter implementations, so that they can be verified by hand.

### 9.3 Comparison with the Indexed Approach

One line of work closely related to the approach in this dissertation is the “index-based semantic model” approach [6, 18, 3, 8, 9, 2, 56, 10]. In this section we refer to that approach as the indexed approach.

The indexed approach and XCAP share a lot of similarities as both of them belong to the foundational proof-carrying code architecture. Both are based on mechanized meta logics. While XCAP uses higher-order predicate logic with inductive definitions, the indexed approach has used the same higher-order predicate logic as well as another higher-order logic. Both want to produce foundational proof for well-typed programs and face the same problem of how to modularly express higher-order type-oriented features such as embedded code pointers, general references, impredicative polymorphisms, and recursive types in the meta logic.

The theoretical difference between the indexed approach and the XCAP approach is mainly on whether to use an extra natural number as “index” in “world” and judgments, or to use an extra thin layer of syntax and interpretation to solve the above problem. In the most recent version of the indexed approach, Appel *et al* [10] proposed a new modal model for expressing recursive and impredicative quantified types with mutable reference. Their method uses a Kripke semantics of the Godel-Lob logic of provability. To support mutable reference, they include the memory mapping information in the “world”.

At the top level, the difference seems to be rather big considering the similarity be-

tween the two approaches. There are several reasons for the different solutions the two approaches have reached as of today.

For the indexed approach, they initially chose to work upon a logic framework (LF), formalize the meta logic (HOL) in it, and build semantic models for types in HOL. This approach turned out to be quite heavyweight, as one has to first build an entire mechanized meta logic through deep embedding. Even that, since there is no built-in inductive definition in the meta logic, one has to use Gödel numbers to simulate simple inductive definitions. While for the CAP/XCAP approach, the work has been based on CiC / Coq, which act as both a logic framework and a meta logic framework (through shallow embedding), and have built-in support of inductive definitions. This allows light-weight implementation, as shown by the hundreds of lines of code for CAP and the thousands of lines of code for PropX and XCAP. Later on, the indexed approach has switched to use CiC/Coq and can now offload the burden of building meta logics and simulating inductive definitions. Since this switch is relatively new, it is still too early to make detailed comparison between the implementations of the latest indexed approach and XCAP.

One major reason is the different guarantee of these two approaches. For XCAP, it is designed to support more than "type safety" at each program point. The code heap specification specifies the safety policy in general logic predicates (safety policy), and is completely customizable. Meta theory of XCAP guarantees that these safety policies are actually observable. While for the indexed approach, the definition of "codeptr" has always been based on a non-stuckness safety (when the program jumps to an indirect address, the meta theory can only guarantee that the future execution of the program will never get stuck). Thus, there the safety guarantee is more like the traditional notion of type safety and memory safety. It remains unclear how to change the definition of "codeptr" in order to support customizable and observable safety policies without rebuilding the meta theory of the indexed approach.

Another reason is the difference between the verification targets of these approaches. XCAP is designed to support direct verification of assembly programs with non-trivial

properties not expressible in traditional types. Besides the examples discussed in this dissertation, various examples of CAP/XCAP programs have been mechanically certified [62, 64, 20, 21, 19]. While the indexed approach has been focusing on type-preserving compilations from high-level languages, thus having few applications of direct verification at assembly level now. It is important to note that the XCAP approach also fits into the type-preserving compilation process, as shown by the previous chapters.

We believe that both the indexed approach and the XCAP approach have a long way to go before reaching a fully practical FPCC framework. There are recent trends which show the possibility of these two approaches meeting and merging in the future development.

## 9.4 Future Work

Given the comprehensive nature of the XCAP framework, there can be many possible future work on type theory, logic theory, verification framework and tools, certifying compilers, safe languages, certified software components, and software security.

**General support of higher-order logic specifications** One observation from the hybrid type/logic approach being used in XCAP is that, although it currently requires two different forms of syntactic higher-order specifications, for code pointer and data pointer, respectively, there is great similarity in their formation, interpretation, and usage. One idea is to investigate if there is any profound relationship between them, and hopefully to find out a more general way to describe higher-order logic specifications. The intuition is to build a *PropX* level facility whose role and functionality is similar to the inductive definition facility in *Prop*. However, that will raise the question of what exactly the additional layer of syntax does, which we do not have a crystal clear understanding yet. The result here could be very useful for specification and verification of synchronization primitives and garbage collection.

**Integrating logic specifications into type systems** A different angle to view the framework is to keep existing type systems and add logic-based specifications into them for the additional expressive power. There are already some recent works along this line. However, the goal here is not to define one or a few particular logic-enriched type systems, but to develop a systematic method to inject (partial) logic-power into existing type systems. Hopefully, the effort that is needed to enhance existing typed codes safety guarantee will then be much less.

**Infrastructure support for logic-based verification** This part includes the selection, evaluation, and potentially development of mathematical logic, binary format, concrete syntax, proof checker, theorem prover, proof assistant, as well as the optimization and delivery of proof. Another important long-term effort will be on building up proof libraries and macros that can be reused. This includes machine architecture, mathematic theory, type theory, logic theory, proof-searching tools, proof-abstraction tools, etc.

**Compare various works on type/logic theory** The integration of type and logic power into a formal system has been an active area in the past few years. Other than the works on CAP/XCAP, there are works such as Princeton's indexed model for types, Harvard's hybrid type/logic system, etc. Investigating these works and conducting an comprehensive comparison of them can be useful and might yield interesting results.

**General memory management** Memory management has been active field in the past decades. In type system world, there are regions, linear types, stack types, *etc.*; while in the logic system world, there are separation logic, BI logic, state logic, CAP/XCAP, etc. Apart from the specific research on certifying garbage collection, it will be interesting to explore the more general memory management models in the logic-based context. For example, it might be eventually possible to let users mix the usage of “managed” and “unmanaged” data pointers and even do intensional analysis of the kind of data pointers.

**C-level logic-based language** To write system components in a safe language, assembly level is apparently too low, while high level languages might be a bit too high for some cases, especially for legacy systems. The major technical gap between assembly and C lies in: 1) C stack abstraction; 2) register allocation; 3) specification of complex control flow; and 4) preservation after compiler optimization. As the first step, 4) could be ignored or naively handled, since it belongs to the more general question of the interaction between logic-based specification and type-based compilation.

**Dynamic code generation / loading / linking** As a common runtime feature, dynamic code generation, loading, and linking are difficult to establish their safety, which are crucial for extensible OS and other software. Dynamic linking for type-safe code has been relatively clear by now. However, with components such as runtime being certified using logic-based approach, how safe linking can be automatically and dynamically carried out is still unclear. The most extreme case will be self-modifying code.

**Binary compatibility for certified and legacy code** One possible and interesting question to ask is whether both certified and legacy code can be compatible with each other. There are two scenarios. The first is to have certified component being deployed in a legacy system, knowing that internally the component should not contain any bug, to increase the overall dependability of systems, as well as to help debugging other legacy components. This is relatively easy to do. The second scenario is to have certified runtime create protected virtual machine to load and run uncertified legacy code, and enforce strict check upon its interface, thus achieving a safe and legacy-friendly system. Communications between certified and legacy code in such systems will be interesting, too.

**BIOS / firmware / boot loader** No matter how the OS and software stack are verified, the layer between software and the actual “safe” hardware, however thin it is, is still a major venerable part of the entire system. Given the ability to update firmware in most of

the modern computing devices, it is difficult to maintain the system integrity. Very likely, emerging security features such as TPM and DRM will also require safety guarantees of firmware to prevent them from being compromised by all possible software-based attacks. These code are suitable for the low-level logic-based verification. Also, there are many interesting non-safety properties for these code. For example, for a firmware, one of the most important features is that it will allow future update what so ever. It is interesting to certify this particular property in the case of power loss during an update.

**Preemptive threading (interrupts, I/O, etc.)** In the work on verifying threading in this dissertation, as there are no interrupts, only co-operative non-preemptive scheduling is supported. One argument is that preemptive scheduling can be treated as a special case of non-preemptive threading, by inserting a virtual yield instruction between every instruction that has the interrupt bit enabled. It will be interesting to see what kind of re-structuring is necessary to make the specification and proof reasonably simple. I/O will be then meaningful to support with the presence of hardware interrupts.

**Device driver** Following up work will naturally be device drivers, which currently are either unsafe (Windows) or restricted (Singularity). The vision is that, even for device drivers as complex as self-modifying ones, they can indeed be certified without any compromise. Properties such as temporal ones might need to be introduced here if correctness is a concern. Domain-specific type/logic languages might be defined to utilize the commonality of device drivers. Partial correctness and simple liveness properties might need insights from existing work on device-driver verification, such as model checking.

**Garbage collectors** It is expected that verification of partial-correctness of a garbage collection process requires more than the plain type safety provided by typical typed languages. Under the hood, one major problem for verification of garbage collection is that traditionally we do not know how to flexibly describe and cast between both strong and

weak update reference cell, as well as any reference cell with mixed properties between strong and weak update, *e.g.*, a reference cell with at most  $n$  alias.

**High-level logic-based language** It may be desirable to write the majority of the system kernel and service in high-level languages. It will be interesting to discover how to 1) increase the expressive power of specification languages so more advanced code and properties can be included; 2) increase the level of assurance by using sound logic and checker; 3) integrate the work with other levels of logic-based verification, such as those used by runtime.

**Synchronization primitives (mutex, channel, etc.)** In the work on verifying threading in this dissertation, only basic services such as creation, yielding, and termination of threads are supported. However, in practice, the most interesting feature of thread libraries for concurrent programming is the set of synchronization primitives it provides. Which set of primitives to support and how they are implemented depends heavily on the concurrent programming model, machine mode, requirement on throughput and response time, as well as the external module interface to be used together with the library. The work will not only yield a set of certified implementations of synchronization primitives, but also track down the precise interfaces of them, which will be interesting by themselves. It might also be connected to existing works on different synchronization and communication methods, such as TLA, allowing us to do an exact comparison of different concurrency models and potentially make code that is based on different concurrency models work together.

**Security properties** As an important feature of system APIs, security features and the properties they guarantee have traditionally been enforced using either dynamic machine semantics or static semantics of security types. With the runtime and other kernel service being certified with logic specifications, it is possible to use logic formulas to describe

security policies, thus allowing security guarantees that are much more flexible.

**Logic specifications in certifying compiler** Even if we have a higher-level language with logic specifications, right now it is unclear how it would fit into the certifying compiler. Once decidable type checking and typing derivation are replaced by logic proofs, the optimization process will lose the crucial structural information it need to transform the proof. On the other hand, the ability to use arbitrary logic specifications during the compilation process might eventually provide more expressive power on the kinds of optimizations supported comparing with the traditional type-preserving compilation approach, thus producing better optimized certifying compiler.

## Appendix A

# PropX Validity Soundness Proof

In this section we give the proof structure for the soundness of *PropX* validity rules, as introduced in Section 4.1, Section 4.3, Section 5.1, and Section 6.2. To avoid confusions, we first present the complete *PropX* definition in Figure A.1 and the complete set of *PropX* validity rules in Figure A.2.

We present the soundness of *PropX* interpretation (validity rules), as previously discussed in Theorem 4.1 and Theorem 6.1, as follows.

**Theorem A.1 (Soundness of XCAP *PropX* Interpretation)**

1. If  $\llbracket \langle p \rangle \rrbracket_{\Psi, \Phi}$  then  $p$ ;
2. if  $\llbracket \text{cptr}(f, a) \rrbracket_{\Psi, \Phi}$  then  $\Psi(f) = a$ ;
3. if  $\llbracket \text{ref}(l, t) \rrbracket_{\Psi, \Phi}$  then  $\Phi(l) = t$ ;
4. if  $\llbracket P \wedge Q \rrbracket_{\Psi, \Phi}$  then  $\llbracket P \rrbracket_{\Psi, \Phi}$  and  $\llbracket Q \rrbracket_{\Psi, \Phi}$ ;
5. if  $\llbracket P \vee Q \rrbracket_{\Psi, \Phi}$  then either  $\llbracket P \rrbracket_{\Psi, \Phi}$  or  $\llbracket Q \rrbracket_{\Psi, \Phi}$ ;
6. if  $\llbracket P \rightarrow Q \rrbracket_{\Psi, \Phi}$  and  $\llbracket P \rrbracket_{\Psi, \Phi}$  then  $\llbracket Q \rrbracket_{\Psi, \Phi}$ ;
7. if  $\llbracket \forall x:A. P \rrbracket_{\Psi, \Phi}$  and  $B:A$  then  $\llbracket P[B/x] \rrbracket_{\Psi, \Phi}$ ;
8. if  $\llbracket \exists x:A. P \rrbracket_{\Psi, \Phi}$  then there exists  $B:A$  such that  $\llbracket P[B/x] \rrbracket_{\Psi, \Phi}$ ;

|                                                      |                                                 |
|------------------------------------------------------|-------------------------------------------------|
| $(PropX) P, Q ::= \langle p \rangle$                 | <i>lifted meta proposition</i>                  |
| $\text{cptr}(f, a)$                                  | <i>embedded code pointer</i>                    |
| $\text{ref}(l, t)$                                   | <i>reference cell pointer</i>                   |
| $P \wedge Q$                                         | <i>conjunction</i>                              |
| $P \vee Q$                                           | <i>disjunction</i>                              |
| $P \rightarrow Q$                                    | <i>implication</i>                              |
| $\forall x:A. P$                                     | <i>universal quantification</i>                 |
| $\exists x:A. P$                                     | <i>existential quantification</i>               |
| $\forall \alpha:A \rightarrow PropX. P$              | <i>impredicative universal quantification</i>   |
| $\exists \alpha:A \rightarrow PropX. P$              | <i>impredicative existential quantification</i> |
| $(\mu \alpha:A \rightarrow PropX. \lambda x:A. P) B$ | <i>recursive specification</i>                  |
| $(CdHpSpec) \Psi ::= \{f \rightsquigarrow a\}^*$     |                                                 |
| $(DtHpSpec) \Phi ::= \{l \rightsquigarrow t\}^*$     |                                                 |
| $(Assertion) \ a \in State \rightarrow PropX$        |                                                 |
| $(WordTy) \ t \in Word \rightarrow PropX$            |                                                 |
| $(Env) \ \Gamma ::= \cdot \mid \Gamma, P$            |                                                 |

Figure A.1: Assertion language of the full-featured XCAP

$\boxed{\Gamma \vdash_{\Psi, \Phi} P}$  (*Validity of Extended Propositions*)

(The following presentation omits the  $\Psi$  and  $\Phi$  in judgment  $\Gamma \vdash_{\Psi, \Phi} P$ .)

$$\begin{array}{c}
 \frac{P \in \Gamma}{\Gamma \vdash P} \text{ (ENV)} \quad \frac{p}{\Gamma \vdash \langle p \rangle} \text{ (}\langle \rangle\text{-I)} \quad \frac{\Gamma \vdash \langle p \rangle}{p \supset (\Gamma \vdash Q)} \quad \frac{p \supset (\Gamma \vdash Q)}{\Gamma \vdash Q} \text{ (}\langle \rangle\text{-E)} \\
 \\
 \frac{\Psi(f) = a}{\Gamma \vdash \text{cptr}(f, a)} \text{ (CP-I)} \quad \frac{\Gamma \vdash \text{cptr}(f, a)}{\Gamma \vdash Q} \quad \frac{(\Psi(f) = a) \supset (\Gamma \vdash Q)}{\Gamma \vdash Q} \text{ (CP-E)} \\
 \\
 \frac{\Phi(l) = t}{\Gamma \vdash \text{ref}(l, t)} \text{ (RF-I)} \quad \frac{\Gamma \vdash \text{ref}(l, t)}{\Gamma \vdash !Q} \quad \frac{(\Phi(l) = t) \supset (\Gamma \vdash Q)}{\Gamma \vdash Q} \text{ (RF-E)} \\
 \\
 \frac{\Gamma \vdash P \quad \Gamma \vdash Q}{\Gamma \vdash P \& Q} \text{ (}\&\text{-I)} \quad \frac{\Gamma \vdash P \& Q}{\Gamma \vdash P} \text{ (}\&\text{-E1)} \quad \frac{\Gamma \vdash P \& Q}{\Gamma \vdash Q} \text{ (}\&\text{-E2)} \\
 \\
 \frac{\Gamma \vdash P}{\Gamma \vdash P \vee Q} \text{ (}\vee\text{-I1)} \quad \frac{\Gamma \vdash Q}{\Gamma \vdash P \vee Q} \text{ (}\vee\text{-I2)} \quad \frac{\Gamma \vdash P \vee Q \quad \Gamma, P \vdash R \quad \Gamma, Q \vdash R}{\Gamma \vdash R} \text{ (}\vee\text{-E)} \\
 \\
 \frac{\Gamma, P \vdash Q}{\Gamma \vdash P \rightarrow Q} \text{ (}\rightarrow\text{-I)} \quad \frac{\Gamma \vdash P \rightarrow Q \quad \Gamma \vdash P}{\Gamma \vdash Q} \text{ (}\rightarrow\text{-E)} \\
 \\
 \frac{\Gamma \vdash P[B/x] \quad \forall B:A}{\Gamma \vdash \forall x:A. P} \text{ (}\forall\text{-I1)} \quad \frac{\Gamma \vdash \forall x:A. P \quad B:A}{\Gamma \vdash P[B/x]} \text{ (}\forall\text{-E1)} \\
 \\
 \frac{B:A \quad \Gamma \vdash P[B/x]}{\Gamma \vdash \exists x:A. P} \text{ (}\exists\text{-I1)} \quad \frac{\Gamma \vdash \exists x:A. P \quad \Gamma, P[B/x] \vdash Q \quad \forall B:A}{\Gamma \vdash Q} \text{ (}\exists\text{-E1)} \\
 \\
 \frac{\Gamma \vdash P[a/\alpha] \quad \forall a:A \rightarrow \text{PropX}}{\Gamma \vdash \forall \alpha:A \rightarrow \text{PropX}. P} \text{ (}\forall\text{-I2)} \quad \frac{a:A \rightarrow \text{PropX} \quad \Gamma \vdash P[a/\alpha]}{\Gamma \vdash \exists \alpha:A \rightarrow \text{PropX}. P} \text{ (}\exists\text{-I2)} \\
 \\
 \frac{B:A \quad \Gamma \vdash P[B/x][\mu \alpha:A \rightarrow \text{PropX}. \lambda x:A. P/\alpha]}{\Gamma \vdash (\mu \alpha:A \rightarrow \text{PropX}. \lambda x:A. P \ B)} \text{ (}\mu\text{-I)}
 \end{array}$$

Figure A.2: Validity rules for extended propositions of the full-featured XCAP

9. if  $\llbracket \forall \alpha : A \rightarrow \text{Prop}_X. P \rrbracket_{\Psi, \Phi}$  and  $\alpha : A \rightarrow \text{Prop}_X$  then  $\llbracket P[\alpha/\alpha] \rrbracket_{\Psi, \Phi}$ ;
10. if  $\llbracket \exists \alpha : A \rightarrow \text{Prop}_X. P \rrbracket_{\Psi, \Phi}$  then there exists  $\alpha : A \rightarrow \text{Prop}_X$  such that  $\llbracket P[\alpha/\alpha] \rrbracket_{\Psi, \Phi}$ ;
11. if  $\llbracket (\mu \alpha. \lambda x : A. P) B \rrbracket_{\Psi, \Phi}$  then  $\llbracket P[B/x]((\mu \alpha. \lambda x : A. P)/\alpha) \rrbracket_{\Psi, \Phi}$ .

**Corollary A.2 (XCAP Consistency)**  $\llbracket \langle \text{False} \rangle \rrbracket_{\Psi, \Phi}$  is not provable.

For the proof, we follow the syntactic normalization proof methods in Pfenning [51].

The validity of extended propositions rules of form  $\Gamma \vdash_{\Psi, \Phi} P$  in Figure A.2 are natural deduction rules. We classify them into the following two kinds (and call them altogether as **normal natural validity rules**) as shown in Figure A.3.

$\Gamma \vdash_{\Psi, \Phi} P \uparrow$  Extended Proposition P has a normal deduction, and

$\Gamma \vdash_{\Psi, \Phi} P \downarrow$  Extended Proposition P is extracted from a hypodissertation.

We define the **annotated natural deduction rules** by annotating each normal validity rules with a “+” symbol as  $\Gamma \vdash_{\Psi, \Phi}^+ P \uparrow$  and  $\Gamma \vdash_{\Psi, \Phi}^+ P \downarrow$  and adding the following coercion rule.

$$\frac{\Gamma \vdash_{\Psi, \Phi}^+ P \uparrow}{\Gamma \vdash_{\Psi, \Phi}^+ P \downarrow} (\text{COER}')$$

We then define the **sequent style validity rules** of form  $\Gamma \Rightarrow_{\Psi, \Phi} P$  in Figure A.4 and extend the sequent rules by annotating sequent judgments with a “+” as  $\Gamma \Rightarrow_{\Psi, \Phi}^+ P$  and adding the cut rule.

$$\frac{\Gamma \Rightarrow_{\Psi, \Phi}^+ P \quad \Gamma, P \Rightarrow_{\Psi, \Phi}^+ Q}{\Gamma \Rightarrow_{\Psi, \Phi}^+ Q} (\text{CUT})$$

The normalization proof process is:

$$\Gamma \vdash_{\Psi, \Phi} P \xrightarrow{A.5} \Gamma \vdash_{\Psi, \Phi}^+ P \uparrow \xrightarrow{A.9} \Gamma \Rightarrow_{\Psi, \Phi}^+ P \xrightarrow{A.11} \Gamma \Rightarrow_{\Psi, \Phi} P \xrightarrow{A.6} \Gamma \vdash_{\Psi, \Phi} P \uparrow.$$

First, natural deduction derivations are mapped to derivations in sequent validity rules with cut. Then we do cut-elimination in sequent rules, and map the new cut-free sequent derivation back to a normal natural deduction derivation. Soundness of  $\text{Prop}_X$  interpretation can then be proved, since the last rule in a normal natural deduction derivation must be one of the introduction rules. We prove the following theorems to construct

$$\boxed{\Gamma \vdash_{\Psi, \Phi} P \uparrow \quad \Gamma \vdash_{\Psi, \Phi} P \downarrow} \quad (\text{Validity of Extended Propositions})$$

(The following rules omits the  $\Psi$  and  $\Phi$  in judgments  $\Gamma \vdash_{\Psi, \Phi} P \uparrow$  and  $\Gamma \vdash_{\Psi, \Phi} P \downarrow$ .)

$$\begin{array}{c}
\frac{\Gamma \vdash P \downarrow}{\Gamma \vdash P \uparrow} \text{ (COER)} \quad \frac{P \in \Gamma}{\Gamma \vdash P \downarrow} \text{ (ENV)} \\[10pt]
\frac{p}{\Gamma \vdash \langle p \rangle \uparrow} \text{ (}\langle \rangle\text{-I)} \quad \frac{\Gamma \vdash \langle p \rangle \downarrow \quad p \supset (\Gamma \vdash Q \uparrow)}{\Gamma \vdash Q \uparrow} \text{ (}\langle \rangle\text{-E)} \\[10pt]
\frac{\Psi(f) = a}{\Gamma \vdash \text{cptr}(f, a) \uparrow} \text{ (CP-I)} \quad \frac{\Gamma \vdash \text{cptr}(f, a) \downarrow \quad (\Psi(f) = a) \supset (\Gamma \vdash Q \uparrow)}{\Gamma \vdash Q \uparrow} \text{ (CP-E)} \\[10pt]
\frac{\Phi(l) = t}{\Gamma \vdash \text{ref}(l, t) \uparrow} \text{ (RF-I)} \quad \frac{\Gamma \vdash \text{ref}(l, t) \downarrow \quad (\Phi(l) = t) \supset (\Gamma \vdash Q \uparrow)}{\Gamma \vdash Q \uparrow} \text{ (RF-E)} \\[10pt]
\frac{\Gamma \vdash P \uparrow \quad \Gamma \vdash Q \uparrow}{\Gamma \vdash P \wedge Q \uparrow} \text{ (\wedge-I)} \quad \frac{\Gamma \vdash P \wedge Q \downarrow}{\Gamma \vdash P \downarrow} \text{ (\wedge-E1)} \quad \frac{\Gamma \vdash P \wedge Q \downarrow}{\Gamma \vdash Q \downarrow} \text{ (\wedge-E2)} \\[10pt]
\frac{\Gamma \vdash P \uparrow}{\Gamma \vdash P \vee Q \uparrow} \text{ (\vee-I1)} \quad \frac{\Gamma \vdash Q \uparrow}{\Gamma \vdash P \vee Q \uparrow} \text{ (\vee-I2)} \quad \frac{\Gamma \vdash P \vee Q \downarrow \quad \Gamma, P \vdash R \uparrow \quad \Gamma, Q \vdash R \uparrow}{\Gamma \vdash R \uparrow} \text{ (\vee-E)} \\[10pt]
\frac{\Gamma, P \vdash Q \uparrow}{\Gamma \vdash P \rightarrow Q \uparrow} \text{ (\rightarrow-I)} \quad \frac{\Gamma \vdash P \rightarrow Q \downarrow \quad \Gamma \vdash P \uparrow}{\Gamma \vdash Q \downarrow} \text{ (\rightarrow-E)} \\[10pt]
\frac{\Gamma \vdash P[B/x] \uparrow \quad \forall B:A}{\Gamma \vdash \forall x:A. P \uparrow} \text{ (\forall-I1)} \quad \frac{\Gamma \vdash \forall x:A. P \downarrow \quad B:A}{\Gamma \vdash P[B/x] \downarrow} \text{ (\forall-E1)} \\[10pt]
\frac{B:A \quad \Gamma \vdash P[B/x] \uparrow}{\Gamma \vdash \exists x:A. P \uparrow} \text{ (\exists-I1)} \quad \frac{\Gamma \vdash \exists x:A. P \downarrow \quad \Gamma, P[B/x] \vdash Q \uparrow \quad \forall B:A}{\Gamma \vdash Q \uparrow} \text{ (\exists-E1)} \\[10pt]
\frac{\Gamma \vdash P[a/\alpha] \uparrow \quad \forall a:A \rightarrow \text{Prop} X}{\Gamma \vdash \forall \alpha:A \rightarrow \text{Prop} X. P \uparrow} \text{ (\forall-I2)} \quad \frac{a:A \rightarrow \text{Prop} X \quad \Gamma \vdash P[a/\alpha] \uparrow}{\Gamma \vdash \exists \alpha:A \rightarrow \text{Prop} X. P \uparrow} \text{ (\exists-I2)} \\[10pt]
\frac{B:A \quad \Gamma \vdash P[B/x][\mu \alpha:A \rightarrow \text{Prop} X. \lambda x:A. P/\alpha] \uparrow}{\Gamma \vdash (\mu \alpha:A \rightarrow \text{Prop} X. \lambda x:A. P B) \uparrow} \text{ (\mu-I)}
\end{array}$$

Figure A.3: Normal natural deduction validity rules

$\boxed{\Gamma \implies_{\Psi, \Phi} P}$  (*Validity of Extended Propositions*)

(The following rules omits the  $\Psi$  and  $\Phi$  in judgment  $\Gamma \implies_{\Psi, \Phi} P$ .)

$$\begin{array}{c}
 \frac{P \in \Gamma}{\Gamma \implies P} \text{ (INIT)} \quad \frac{p}{\Gamma \implies \langle p \rangle} \text{ (}\langle \rangle\text{-R)} \quad \frac{p \supset (\Gamma, \langle p \rangle \implies Q)}{\Gamma, \langle p \rangle \implies Q} \text{ (}\langle \rangle\text{-L)} \\
 \\ 
 \frac{\Psi(f) = a}{\Gamma \implies \text{cptr}(f, a)} \text{ (CP-R)} \quad \frac{(\Psi(f) = a) \supset (\Gamma, \text{cptr}(f, a) \implies Q)}{\Gamma, \text{cptr}(f, a) \implies Q} \text{ (CP-L)} \\
 \\ 
 \frac{\Phi(l) = t}{\Gamma \implies \text{ref}(l, t)} \text{ (RF-R)} \quad \frac{(\Phi(l) = t) \supset (\Gamma, \text{ref}(l, t) \implies Q)}{\Gamma, \text{ref}(l, t) \implies Q} \text{ (RF-L)} \\
 \\ 
 \frac{\Gamma \implies P \quad \Gamma \implies Q}{\Gamma \implies P \& Q} \text{ (\&-R)} \quad \frac{\Gamma, P \& Q, P \implies R}{\Gamma, P \& Q \implies R} \text{ (\&-L1)} \quad \frac{\Gamma, P \& Q, Q \implies R}{\Gamma, P \& Q \implies R} \text{ (\&-L2)} \\
 \\ 
 \frac{\Gamma \implies P}{\Gamma \implies P \vee Q} \text{ (}\vee\text{-R1)} \quad \frac{\Gamma \implies Q}{\Gamma \implies P \vee Q} \text{ (}\vee\text{-R2)} \quad \frac{\Gamma, P \vee Q, P \implies R \quad \Gamma, P \vee Q, Q \implies R}{\Gamma, P \vee Q \implies R} \text{ (}\vee\text{-L)} \\
 \\ 
 \frac{\Gamma, P \implies Q}{\Gamma \implies P \rightarrow Q} \text{ (}\rightarrow\text{-R)} \quad \frac{\Gamma, P \multimap Q \implies P \quad \Gamma, P \multimap Q, P \implies R}{\Gamma, P \multimap Q \implies R} \text{ (}\multimap\text{-L)} \\
 \\ 
 \frac{\Gamma \implies P[B/x] \quad \forall B:A}{\Gamma \implies \forall x:A.P} \text{ (}\forall\text{-R1)} \quad \frac{\Gamma, \forall x:A.P, P[B/x] \implies Q \quad B:A}{\Gamma, \forall x:A.P \implies Q} \text{ (}\forall\text{-L1)} \\
 \\ 
 \frac{B:A \quad \Gamma \implies P[B/x]}{\Gamma \implies \exists x:A.P} \text{ (}\exists\text{-R1)} \quad \frac{\Gamma, \exists x:A.P, P[B/x] \implies Q \quad \forall B:A}{\Gamma, \exists x:A.P \implies Q} \text{ (}\exists\text{-L1)} \\
 \\ 
 \frac{\Gamma \implies P[a/\alpha] \quad \forall a:A \rightarrow \text{Prop}X}{\Gamma \implies \forall \alpha:A \rightarrow \text{Prop}X.P} \text{ (}\forall\text{-R2)} \quad \frac{a:A \rightarrow \text{Prop}X \quad \Gamma \implies P[a/\alpha]}{\Gamma \implies \exists \alpha:A \rightarrow \text{Prop}X.P} \text{ (}\exists\text{-R2)} \\
 \\ 
 \frac{B:A \quad \Gamma \implies P[B/x][\mu \alpha:A \rightarrow \text{Prop}X. \lambda x:A.P/\alpha]}{\Gamma \implies (\mu \alpha:A \rightarrow \text{Prop}X. \lambda x:A.P \ B)} \text{ (\mu-R)}
 \end{array}$$

Figure A.4: Sequent style validity rules

the proof. We use structural induction in most of the proof. The full proof has been mechanized in the Coq proof assistant.

**Theorem A.3 (Soundness of Normal Deductions)**

1. If  $\Gamma \vdash_{\Psi, \Phi} P \uparrow$  then  $\Gamma \vdash_{\Psi, \Phi} P$ , and
2. if  $\Gamma \vdash_{\Psi, \Phi} P \downarrow$  then  $\Gamma \vdash_{\Psi, \Phi} P$ .

**Theorem A.4 (Soundness of Annotated Deductions)**

1. If  $\Gamma \vdash_{\Psi, \Phi}^+ P \uparrow$  then  $\Gamma \vdash_{\Psi, \Phi} P$ , and
2. if  $\Gamma \vdash_{\Psi, \Phi}^+ P \downarrow$  then  $\Gamma \vdash_{\Psi, \Phi} P$ .

**Theorem A.5 (Completeness of Annotated Deductions)**

1. If  $\Gamma \vdash_{\Psi, \Phi} P$  then  $\Gamma \vdash_{\Psi, \Phi}^+ P \uparrow$ , and
2. if  $\Gamma \vdash_{\Psi, \Phi} P$  then  $\Gamma \vdash_{\Psi, \Phi}^+ P \downarrow$ .

**Theorem A.6 (Soundness of Sequent Calculus)**

If  $\Gamma \implies_{\Psi, \Phi} P$  then  $\Gamma \vdash_{\Psi, \Phi} P \uparrow$ .

**Theorem A.7 (Completeness of Sequent Derivations)**

1. If  $\Gamma \vdash_{\Psi, \Phi} P \uparrow$  then  $\Gamma \implies_{\Psi, \Phi} P$ , and
2. if  $\Gamma \vdash_{\Psi, \Phi} P \downarrow$  and  $\Gamma, P \implies_{\Psi, \Phi} Q$  then  $\Gamma \implies_{\Psi, \Phi} Q$ .

**Theorem A.8 (Soundness of Sequent Calculus with Cut)**

If  $\Gamma \implies_{\Psi, \Phi}^+ P$  then  $\Gamma \vdash_{\Psi, \Phi}^+ P \uparrow$ .

**Theorem A.9 (Completeness of Sequent Calculus with Cut)**

1. If  $\Gamma \vdash_{\Psi, \Phi}^+ P \uparrow$  then  $\Gamma \implies_{\Psi, \Phi}^+ P$ , and
2. if  $\Gamma \vdash_{\Psi, \Phi}^+ P \downarrow$  and  $\Gamma, P \implies_{\Psi, \Phi}^+ Q$  then  $\Gamma \implies_{\Psi, \Phi}^+ Q$ .

**Theorem A.10 (Admissibility of Cut)**

If  $\Gamma \implies_{\Psi, \Phi} P$  and  $\Gamma, P \implies_{\Psi, \Phi} Q$  then  $\Gamma \implies_{\Psi, \Phi} Q$ .

**Theorem A.11 (Cut Elimination)**

If  $\Gamma \implies_{\Psi, \Phi}^+ P$  then  $\Gamma \implies_{\Psi, \Phi} P$ .

**Theorem A.12 (Normalization for Natural Deduction)**

If  $\Gamma \vdash_{\Psi, \Phi} P$  then  $\Gamma \vdash_{\Psi, \Phi} P \uparrow$ .

A special form of the above theorem is “if  $\llbracket P \rrbracket_{\Psi, \Phi}$  then  $\cdot \vdash_{\Psi, \Phi} P \uparrow$ ”. The soundness of *PropX* interpretation (Theorem A.1) can be proved using this theorem.

## Appendix B

# XCAP Soundness Proof

Due to the usage of simple semantic subsumptions ( $\Rightarrow$  and  $\Rightarrow_C$ , defined using logical implication  $\supset$ , as in Figure 3.3 and Figure 6.4), it is straight-forward to prove the soundness of XCAP inference rules using the syntactic soundness approach proposed by Wright and Felleisen [59]. In this section, we present the proof sketch for the soundness of XCAP, as discussed in Chapter 3, Chapter 4, and Chapter 5.

### Lemma B.1 (XCAP Instruction Sequence Weakening)

If  $\Psi \vdash \{a\} \mathbb{I}$ ,  $\Psi \subseteq \Psi'$ , and  $a' \Rightarrow a$  then  $\Psi' \vdash \{a'\} \mathbb{I}$ .

Proof Sketch. The proof is by induction over the derivation  $\Psi \vdash \{a\} \mathbb{I}$ , named as  $\mathcal{D}$ .

Case SEQ. The derivation  $\mathcal{D}$  has the form

$$\frac{a \Rightarrow_C a'' \quad \Psi \vdash \{a''\} \mathbb{I}'}{\Psi \vdash \{a\} c; \mathbb{I}'}$$

We have

1. from  $a \Rightarrow a'$  and  $a \Rightarrow_C a''$  it follows that  $a' \Rightarrow_C a''$ ;
2. from  $\Psi \vdash \{a''\} \mathbb{I}'$  by the induction hypothesis it follows that  $\Psi' \vdash \{a''\} \mathbb{I}'$ .

Case JD. The derivation  $\mathcal{D}$  has the form

$$\frac{a \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi)}{\Psi \vdash \{a\} \text{jd } f}$$

From  $a \Rightarrow a'$  and  $a \Rightarrow \Psi(f)$  it follows that  $a' \Rightarrow \Psi(f)$ .

Case BGTI. The derivation  $\mathcal{D}$  has the form

$$\frac{\begin{array}{c} (\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) \leq i \rangle \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow a'' \quad \Psi \vdash \{a''\} \mathbb{I}' \\ (\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) > i \rangle \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi) \end{array}}{\Psi \vdash \{a\} \text{bgti } r_s, i, f; \mathbb{I}'}$$

We have

1. from  $a \Rightarrow a'$  and  $(\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) \leq i \rangle \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow a''$  it follows that

$$(\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) \leq i \rangle \wedge a'(\mathbb{H}, \mathbb{R})) \Rightarrow a'';$$

2. from  $\Psi \vdash \{a''\} \mathbb{I}'$  and the induction hypodissertation it follows that  $\Psi' \vdash \{a''\} \mathbb{I}'$ ;

3. from  $a \Rightarrow a'$  and  $(\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) > i \rangle \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow \Psi(f)$  it follows that

$$(\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) > i \rangle \wedge a'(\mathbb{H}, \mathbb{R})) \Rightarrow \Psi(f).$$

Case JMP. The derivation  $\mathcal{D}$  has the form

$$\frac{a \Rightarrow (\lambda(\mathbb{H}, \mathbb{R}). a''(\mathbb{H}, \mathbb{R}) \wedge \text{cptr}(\mathbb{R}(r), a'')))}{\Psi \vdash \{a\} \text{jmp } r}$$

From  $a \Rightarrow a'$  and  $a \Rightarrow (\lambda(\mathbb{H}, \mathbb{R}). a''(\mathbb{H}, \mathbb{R}) \wedge \text{cptr}(\mathbb{R}(r), a''))$  it follows that

$$a' \Rightarrow (\lambda(\mathbb{H}, \mathbb{R}). a''(\mathbb{H}, \mathbb{R}) \wedge \text{cptr}(\mathbb{R}(r), a'')).$$

Case ECP. The derivation  $\mathcal{D}$  has the form

$$\frac{(\lambda \mathbb{S}. \text{cptr}(f, \Psi(f)) \wedge a \mathbb{S}) \Rightarrow a'' \quad f \in \text{dom}(\Psi) \quad \Psi \vdash \{a''\} \mathbb{I}}{\Psi \vdash \{a\} \mathbb{I}}$$

We have

1. from  $a \Rightarrow a'$  and  $(\lambda \mathbb{S}. \text{cptr}(f, \Psi(f)) \wedge a \mathbb{S}) \Rightarrow a''$  it follows that

$$(\lambda \mathbb{S}. \text{cptr}(f, \Psi(f)) \wedge a' \mathbb{S}) \Rightarrow a'';$$

2. from  $\Psi \vdash \{a''\} \mathbb{I}$  by the induction hypodissertation it follows that  $\Psi' \vdash \{a''\} \mathbb{I}$ . ■

### Lemma B.2 (XCAP Code Heap Typing)

If  $\Psi_{IN} \vdash \mathbb{C} : \Psi$  and  $f \in \text{dom}(\Psi)$  then  $f \in \text{dom}(\mathbb{C})$  and  $\Psi_{IN} \vdash \{\Psi(f)\} \mathbb{C}(f)$ .

Proof Sketch. The proof is by induction over the derivation  $\Psi_{IN} \vdash \mathbb{C} : \Psi$ .

Case CDHP. The derivation has the form

$$\frac{\Psi_{IN} \vdash \{a_i\} \mathbb{I}_i \quad \forall f_i}{\Psi_{IN} \vdash \{f_1 \rightsquigarrow \mathbb{I}_1, \dots, f_n \rightsquigarrow \mathbb{I}_n\} : \{f_1 \rightsquigarrow a_1, \dots, f_n \rightsquigarrow a_n\}}$$

From  $f \in \text{dom}(\Psi)$  it follows that  $f \in \text{dom}(\mathbb{C})$  and  $\Psi_{IN} \vdash \{\Psi(f)\} \mathbb{C}(f)$ .

**Case** LINK. The derivation has the form

$$\frac{\begin{array}{c} \Psi_{IN1} \vdash \mathbb{C}_1 : \Psi_1 \quad \Psi_{IN2} \vdash \mathbb{C}_2 : \Psi_2 \quad \Psi_{IN1}(f) = \Psi_{IN2}(f) \\ \text{dom}(\mathbb{C}_1) \cap \text{dom}(\mathbb{C}_2) = \emptyset \quad \forall f \in \text{dom}(\Psi_{IN1}) \cap \text{dom}(\Psi_{IN2}) \end{array}}{\Psi_{IN1} \cup \Psi_{IN2} \vdash \mathbb{C}_1 \cup \mathbb{C}_2 : \Psi_1 \cup \Psi_2}$$

From  $\text{dom}(\mathbb{C}_1) \cap \text{dom}(\mathbb{C}_2) = \emptyset$  it follows that  $\text{dom}(\Psi_1) \cap \text{dom}(\Psi_2) = \emptyset$ . Then from  $f \in \text{dom}(\Psi)$  it follows that  $f \in \text{dom}(\Psi_i)$ , where  $i$  could be either 1 or 2. From  $\Psi_{INi} \vdash \mathbb{C}_i : \Psi_i$  by the induction hypodissertation it follows that  $f \in \text{dom}(\mathbb{C}_i)$  and  $\Psi_{INi} \vdash \{\Psi_i(f)\} \mathbb{C}_i(f)$ .

From  $f \in \text{dom}(\mathbb{C}_i)$  it follows that  $f \in \text{dom}(\mathbb{C}_1 \cup \mathbb{C}_2)$ .

From  $\Psi_{INi} \vdash \{\Psi_i(f)\} \mathbb{C}_i(f)$  and Instruction Sequence Weakening (Lemma B.1) it follows that  $\Psi_{IN1} \cup \Psi_{IN2} \vdash \{\Psi_1 \cup \Psi_2(f)\} \mathbb{C}_1 \cup \mathbb{C}_2(f)$ . ■

### Lemma B.3 (XCAP State Typing)

If  $\Psi_{IN} \vdash \mathbb{C} : \Psi$  and  $\llbracket \text{cptr}(f, a) \rrbracket_{\Psi}$  then  $f \in \text{dom}(\mathbb{C})$  and  $\Psi_{IN} \vdash \{a\} \mathbb{C}(f)$ .

**Proof Sketch.** From  $\llbracket \text{cptr}(f, a) \rrbracket_{\Psi}$  by *PropX* Validity Soundness (Lemma A.1) it follows that  $f \in \text{dom}(\Psi)$  and  $\Psi(f) = a$ . By Code Heap Typing (Lemma B.2) it follows that  $f \in \text{dom}(\mathbb{C})$  and  $\Psi_{IN} \vdash \{a\} \mathbb{C}(f)$ . ■

### Lemma B.4 (XCAP Progress)

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \longrightarrow \mathbb{P}'$ .

**Proof Sketch.** Derivation  $\Psi_G \vdash \{a\} \mathbb{P}$  has the following form.

$$\frac{\Psi_G \vdash \mathbb{C} : \Psi_G \quad (\llbracket a \rrbracket_{\Psi_G} \mathbb{S}) \quad \Psi_G \vdash \{a\} \mathbb{I}}{\Psi_G \vdash \{a\} (\mathbb{C}, \mathbb{S}, \mathbb{I})}$$

The proof is by induction over derivation  $\Psi_G \vdash \{a\} \mathbb{I}$ .

**Case** SEQ, JD, BGTI and JMP. the proof is by simple inspections.

**Case** ECP. The proof is by the induction hypodissertation. ■

### Lemma B.5 (XCAP Preservation)

If  $\Psi_G \vdash \{a\} \mathbb{P}$  and  $\mathbb{P} \longrightarrow \mathbb{P}'$  then there exists an assertion  $a'$  such that  $\Psi_G \vdash \{a'\} \mathbb{P}'$ .

**Proof Sketch.** Suppose  $\mathbb{P} = (\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{I})$ . We name derivation  $\Psi_G \vdash \{a\} \mathbb{P}$  as  $\mathcal{D}$ , which has the following form.

$$\frac{\Psi_G \vdash \mathbb{C} : \Psi_G \quad (\llbracket a \rrbracket_{\Psi_G} (\mathbb{H}, \mathbb{R})) \quad \Psi_G \vdash \{a\} \mathbb{I}}{\Psi_G \vdash \{a\} (\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{I})}$$

The proof is by induction over the derivation  $\Psi_G \vdash \{a\} \mathbb{I}$ , named as  $\mathcal{E}$ .

**Case SEQ.** The derivation  $\mathcal{E}$  has the form

$$\frac{a \Rightarrow_C a' \quad \Psi_G \vdash \{a'\} \mathbb{I}'}{\Psi_G \vdash \{a\} c; \mathbb{I}'}$$

By the operational semantics,  $\mathbb{P}' = (\mathbb{C}, \text{Next}_C(\mathbb{H}, \mathbb{R}), \mathbb{I}')$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2. from  $(\llbracket a \rrbracket_{\Psi_G} (\mathbb{H}, \mathbb{R}))$  and  $a \Rightarrow_C a'$  it follows that  $(\llbracket a' \rrbracket_{\Psi_G} \text{Next}_C(\mathbb{H}, \mathbb{R}))$ ;
3.  $\Psi_G \vdash \{a'\} \mathbb{I}'$  is in  $\mathcal{E}$ .

**Case JD.** The derivation  $\mathcal{E}$  has the form

$$\frac{a \Rightarrow \Psi_G(f) \quad f \in \text{dom}(\Psi_G)}{\Psi_G \vdash \{a\} \text{jd } f}$$

By the operational semantics,  $\mathbb{P}' = (\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{C}(f))$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2. from  $(\llbracket a \rrbracket_{\Psi_G} (\mathbb{H}, \mathbb{R}))$  and  $a \Rightarrow \Psi_G(f)$  it follows that  $(\llbracket \Psi_G(f) \rrbracket_{\Psi_G} (\mathbb{H}, \mathbb{R}))$ ;
3. by  $\Psi_G \vdash \mathbb{C} : \Psi_G$ ,  $f \in \text{dom}(\Psi_G)$ , and Code Heap Typing (Lemma B.2) it follows that  $\Psi_G \vdash \{\Psi_G(f)\} \mathbb{C}(f)$ .

**Case BGTI.** The derivation  $\mathcal{E}$  has the form

$$\frac{\begin{array}{c} (\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) \leq i \rangle \wedge a (\mathbb{H}, \mathbb{R})) \Rightarrow a' \quad \Psi_G \vdash \{a'\} \mathbb{I}' \\ (\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) > i \rangle \wedge a (\mathbb{H}, \mathbb{R})) \Rightarrow \Psi_G(f) \quad f \in \text{dom}(\Psi_G) \end{array}}{\Psi_G \vdash \{a\} \text{bgti } r_s, i, f; \mathbb{I}'}$$

By the operational semantics, when  $\mathbb{R}(r_s) > i$ ,  $\mathbb{P}' = (\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{C}(f))$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2. from  $([\![a]\!]_{\Psi_G} (\mathbb{H}, \mathbb{R}))$  and  $(\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) \leq i \rangle \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow a'$  it follows that  $([\![\Psi_G(f)]]\!]_{\Psi_G} (\mathbb{H}, \mathbb{R})$ ;
3. by  $\Psi_G \vdash \mathbb{C} : \Psi_G$ ,  $f \in \text{dom}(\Psi_G)$ , and Code Heap Typing (Lemma B.2) it follows that  $\Psi_G \vdash \{\Psi_G(f)\} \mathbb{C}(f)$ .

By the operational semantics, when  $\mathbb{R}(r_s) \leq i$ ,  $\mathbb{P}' = (\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{I}')$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2. from  $([\![a]\!]_{\Psi_G} (\mathbb{H}, \mathbb{R}))$  and  $(\lambda(\mathbb{H}, \mathbb{R}). \langle \mathbb{R}(r_s) \leq i \rangle \wedge a(\mathbb{H}, \mathbb{R})) \Rightarrow a'$  it follows that  $([\![a']]\!]_{\Psi_G} (\mathbb{H}, \mathbb{R})$ ;
3.  $\Psi_G \vdash \{a'\} \mathbb{I}'$  is in  $\mathcal{E}$ .

Case JMP. The derivation  $\mathcal{E}$  has the form

$$\frac{a \Rightarrow (\lambda(\mathbb{H}, \mathbb{R}). a' (\mathbb{H}, \mathbb{R}) \wedge \text{cptr}(\mathbb{R}(r), a'))}{\Psi_G \vdash \{a\} \text{jmp } r}$$

From  $([\![a]\!]_{\Psi_G} (\mathbb{H}, \mathbb{R}))$  and  $a \Rightarrow (\lambda(\mathbb{H}, \mathbb{R}). a' (\mathbb{H}, \mathbb{R}) \wedge \text{cptr}(\mathbb{R}(r), a'))$  it follows that

$[\![a']]\!]_{\Psi_G} (\mathbb{H}, \mathbb{R})$  and  $[\![\text{cptr}(\mathbb{R}(r), a')]\!]_{\Psi_G}$ .

By the operational semantics,  $\mathbb{P}' = (\mathbb{C}, (\mathbb{H}, \mathbb{R}), \mathbb{C}(\mathbb{R}(r)))$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2.  $([\![a']]\!]_{\Psi_G} (\mathbb{H}, \mathbb{R})$  is from above;
3. by  $\Psi_G \vdash \mathbb{C} : \Psi_G$ ,  $[\![\text{cptr}(\mathbb{R}(r), a')]\!]_{\Psi_G}$ , and State Typing (Lemma B.3) it follows that  $\Psi_G \vdash \{a'\} \mathbb{C}(\mathbb{R}(r))$ .

Case ECP. The derivation  $\mathcal{E}$  has the form

$$\frac{(\lambda \mathbb{S}. \text{cptr}(f, APP\Psi_G f) \wedge a \mathbb{S}) \Rightarrow a' \quad f \in \text{dom}(\Psi_G) \quad \Psi_G \vdash \{a'\} \mathbb{I}}{\Psi_G \vdash \{a\} \mathbb{I}}$$

From  $([\![a]\!]_{\Psi_G} (\mathbb{H}, \mathbb{R})), (\lambda S. \text{cptr}(f, \Psi_G(f)) \wedge a \in S) \Rightarrow a'$ , and  $f \in \text{dom}(\Psi_G)$  it follows that  $([\![a']]\!]_{\Psi_G} (\mathbb{H}, \mathbb{R})$ . Together with  $\Psi_G \vdash \{a'\} \mathbb{I}$ , the prove is completed by using the induction hypodissertation.  $\blacksquare$

**Theorem B.6 (XCAP Soundness)**

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then for all natural number  $n$ , there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \xrightarrow{n} \mathbb{P}'$ .

Proof Sketch. By simple induction on  $n$  and using Progress (Lemma B.4) and Perservation (Lemma B.5).  $\blacksquare$

## Appendix C

# TAL to XCAP Translation Typing Preservation Proof

In this section, we present the proof for the typing preservation theorem of the translation from TAL to XCAP in Chapter 5. Here TAL is as defined in Section 5.2, not the “semantic” one in Section 5.3.

### Lemma C.1 (Value Typing Preservation)

If  $\Psi; \Phi \vdash_{\text{TAL}} w : \tau$  then  $[\![\tau \triangleright w]\!]_{\Gamma\Psi\triangleright, \Gamma\Phi\triangleright}$ .

Proof Sketch. By induction on derivation  $\Psi; \Phi \vdash_{\text{TAL}} w : \tau$ .

Case INT. The derivation has the form

$$\overline{\Psi; \Phi \vdash_{\text{TAL}} w : \text{int}}$$

We have  $\lceil \text{int} \rceil = \lambda w. \text{True}$ , Trivial.

Case CODE. The derivation has the form

$$\frac{f \in \text{dom}(\Psi)}{\Psi; \Phi \vdash_{\text{TAL}} f : \text{code } \Psi(f)}$$

We have  $\lceil \text{code } \Psi(f) \rceil = \lambda w. \text{codeptr}(w, \lceil \Psi(f) \rceil) = \lambda w. \text{codeptr}(w, \lceil \Psi \triangleright(f) \rceil)$ , Trivial.

Case POLY. The derivation has the form

$$\frac{\cdot \vdash \tau' \quad \Psi; \Phi \vdash_{\text{TAL}} f : \text{code } [\alpha, \Delta]. \Gamma}{\Psi; \Phi \vdash_{\text{TAL}} f : \text{code } [\Delta]. \Gamma[\tau'/\alpha]}$$

We have  $\lceil \text{code } [\Delta]. \Gamma[\tau'/\alpha] \rceil = \lambda w. \text{codeptr}(w, \lceil \Delta]. \Gamma[\tau'/\alpha] \rceil) = \lambda w. \text{codeptr}(w, \exists \Delta. \lceil \Gamma[\tau'/\alpha] \rceil)$ .

By the induction hypodissertation it follows that  $\llbracket \lceil \text{code } [\alpha, \Delta]. \Gamma \rceil w \rrbracket \rceil_{\Psi \cap, \Gamma \Phi \cap}$ , and then

$$\llbracket \text{codeptr}(w, \exists \alpha, \Delta. \lceil \Gamma \rceil) \rrbracket \rceil_{\Psi \cap, \Gamma \Phi \cap}.$$

It is easy to show  $\llbracket \forall S. \exists \Delta. \lceil \Gamma[\tau'/\alpha] \rceil S \rightarrow \exists \alpha, \Delta. \lceil \Gamma \rceil S \rrbracket \rceil_{\Psi \cap, \Gamma \Phi \cap}$ . It then follows that

$$\llbracket \text{codeptr}(w, \exists \Delta. \lceil \Gamma[\tau'/\alpha] \rceil) \rrbracket \rceil_{\Psi \cap, \Gamma \Phi \cap}.$$

**Case TUPLE.** The derivation has the form

$$\frac{\Phi(1+i-1)=\tau_i \quad \forall i}{\Psi; \Phi \vdash_{\text{TAL}} 1 : \langle \tau_1, \dots, \tau_n \rangle}$$

$$\begin{aligned} \text{We have } \lceil \langle \tau_1, \dots, \tau_n \rangle \rceil &= \lambda w. \text{record}(w, \lceil \tau_1 \rceil, \dots, \lceil \tau_n \rceil) \\ &= \lambda w. \text{ref}(w, \lceil \tau_1 \rceil) \wedge \dots \wedge \text{ref}(w+n-1, \lceil \tau_n \rceil). \\ &= \lambda w. \text{ref}(w, \lceil \Phi(w) \rceil) \wedge \dots \wedge \text{ref}(w+n-1, \lceil \Phi(w+n-1) \rceil). \\ &= \lambda w. \text{ref}(w, \lceil \Phi(w) \rceil) \wedge \dots \wedge \text{ref}(w+n-1, \lceil \Phi(w+n-1) \rceil). \end{aligned}$$

Trivial.

**Case EXT.** The derivation has the form

$$\frac{\cdot \vdash \tau' \quad \Psi; \Phi \vdash_{\text{TAL}} w : \tau[\tau'/\alpha]}{\Psi; \Phi \vdash_{\text{TAL}} w : \exists \alpha. \tau}$$

We have  $\lceil \exists \alpha. \tau \rceil = \lambda w. \exists \alpha. \lceil \tau \rceil w$ . By the induction hypodissertation it follows that

$$\llbracket \lceil \tau[\tau'/\alpha] \rceil w \rrbracket \rceil_{\Psi \cap, \Gamma \Phi \cap}. \text{ It then follows that } \llbracket \exists \alpha. \lceil \tau \rceil w \rrbracket \rceil_{\Psi \cap, \Gamma \Phi \cap}.$$

**Case ALL.** The derivation has the form

$$\frac{\Psi; \Phi \vdash_{\text{TAL}} w : \tau[\mu \alpha. \tau / \alpha]}{\Psi; \Phi \vdash_{\text{TAL}} w : \mu \alpha. \tau}$$

We have  $\lceil \mu \alpha. \tau \rceil = \lambda w. (\mu \alpha. \lambda x. (\lceil \tau \rceil x) \ w)$ . By the induction hypodissertation it follows

$$\llbracket \lceil \tau[\mu \alpha. \tau / \alpha] \rceil w \rrbracket \rceil_{\Psi \cap, \Gamma \Phi \cap}. \text{ It then follows that } \llbracket (\mu \alpha. \lambda x. (\lceil \tau \rceil x) \ w) \rrbracket \rceil_{\Psi \cap, \Gamma \Phi \cap}. \blacksquare$$

### Lemma C.2 (Register File Typing Preservation)

If  $\Psi; \Phi \vdash_{\text{TAL}} \mathbb{R} : \Gamma$  then  $\llbracket \lceil \mathbb{R} \rceil. \Gamma \rceil (\mathbb{H}, \mathbb{R}) \rrbracket \rceil_{\Psi \cap, \Gamma \Phi \cap}$ .

**Proof Sketch.** Derivation  $\Psi; \Phi \vdash_{\text{TAL}} \mathbb{R} : \Gamma$  has the form

$$\frac{\Psi; \Phi \vdash_{\text{TAL}} \mathbb{R}(r_i) : \tau_i \quad \forall i}{\Psi; \Phi \vdash_{\text{TAL}} \mathbb{R} : \{r_1 \rightsquigarrow \tau_1, \dots, r_n \rightsquigarrow \tau_n\}}$$

We have  $\Gamma[].\Gamma^\neg = \lambda(\mathbb{H}, \mathbb{R}).(\Gamma\tau_1^\neg \mathbb{R}(r_1)) \wedge \dots \wedge (\Gamma\tau_n^\neg \mathbb{R}(r_n))$ .

From  $\Psi; \Phi \vdash_{\text{TAL}} \mathbb{R}(r_i):\tau_i$  by Value Typing Preservation (Lemma C.1) it follows that

$$[\Gamma\tau_i^\neg \mathbb{R}(r_i)]_{\Gamma\Psi^\neg, \Gamma\Phi^\neg}.$$

### Lemma C.3 (Data Heap Typing Preservation)

If  $\Psi \vdash_{\text{TAL}} \mathbb{H}:\Phi$  then  $\mathcal{D}\mathcal{H} \vdash \Psi^\neg \Gamma \Phi^\neg \mathbb{H}$ .

Proof Sketch. Derivation  $\Psi \vdash_{\text{TAL}} \mathbb{H}:\Phi$  has the form

$$\frac{\Psi; \Phi \vdash_{\text{TAL}} \mathbb{H}(1):\Phi(1) \quad \forall 1 \in \text{dom}(\Phi) = \text{dom}(\mathbb{H})}{\Psi \vdash_{\text{TAL}} \mathbb{H}:\Phi}$$

We have  $\mathcal{D}\mathcal{H} \vdash \Psi^\neg \Gamma \Phi^\neg \mathbb{H} = \forall 1 \in \text{dom}(\Phi) = \text{dom}(\mathbb{H}). [\Phi(1) \mathbb{H}(1)]_{\Psi, \Phi}$ .

From  $\Psi; \Phi \vdash_{\text{TAL}} \mathbb{H}(1):\Phi(1)$  by Value Typing Preservation (Lemma C.1) it follows that

$$[\Gamma\Phi(1)^\neg \mathbb{H}(1)]_{\Gamma\Psi^\neg, \Gamma\Phi^\neg}.$$

### Lemma C.4 (State Typing Preservation)

If  $\Psi \vdash_{\text{TAL}} \mathbb{S}:[\Delta].\Gamma$  then  $[\Gamma[\Delta].\Gamma^\neg]_{\Gamma\Psi^\neg} \mathbb{S}$ .

Proof Sketch. Derivation  $\Psi \vdash_{\text{TAL}} \mathbb{S}:[\Delta].\Gamma$  has the form

$$\frac{\cdot \vdash \tau_i \quad \forall i \quad \Psi \vdash_{\text{TAL}} \mathbb{H}:\Phi \quad \Psi; \Phi \vdash_{\text{TAL}} \mathbb{R}:\Gamma[\tau_1, \dots, \tau_n/\alpha_1, \dots, \alpha_n]}{\Psi \vdash_{\text{TAL}} (\mathbb{H}, \mathbb{R}):\alpha_1, \dots, \alpha_n.\Gamma}$$

From  $\Psi \vdash_{\text{TAL}} \mathbb{H}:\Phi$  by Data Heap Typing Preservation (Lemma C.3) it follows that

$$\mathcal{D}\mathcal{H} \vdash \Psi^\neg \Gamma \Phi^\neg \mathbb{H}.$$

From  $\Psi; \Phi \vdash_{\text{TAL}} \mathbb{R}:\Gamma[\tau_1, \dots, \tau_n/\alpha_1, \dots, \alpha_n]$  by Register File Typing Preservation

(Lemma C.2) it follows that  $[\Gamma[\tau_1, \dots, \tau_n/\alpha_1, \dots, \alpha_n]^\neg (\{\}, \mathbb{R})]_{\Gamma\Psi^\neg, \Gamma\Phi^\neg}$ . It then follows that

$$[\Gamma[\alpha_1, \dots, \alpha_n].\Gamma^\neg (\{\}, \mathbb{R})]_{\Gamma\Psi^\neg, \Gamma\Phi^\neg}.$$

Finally, it follows that  $[\Gamma[\alpha_1, \dots, \alpha_n].\Gamma^\neg]_{\Gamma\Psi^\neg} (\mathbb{H}, \mathbb{R})$ .

### Lemma C.5 (Subtyping Preservation)

If  $\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta'].\Gamma'$  then  $\vdash [\Delta].\Gamma^\neg \Rightarrow [\Delta'].\Gamma'^\neg$ .

Proof Sketch. By casing derivation  $\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta'].\Gamma'$ .

Case SUBT. The derivation has the form

$$\frac{\Delta \supseteq \Delta' \quad \forall r \in \text{dom}(\Gamma') \quad \Gamma(r) = \Gamma'(r) \quad \Delta' \vdash \Gamma'(r)}{\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta'].\Gamma'}$$

We have  $\vdash [\Delta].\Gamma \vdash = \lambda S. \exists \Delta. \vdash \Gamma \vdash S$  and  $\vdash [\Delta'].\Gamma' \vdash = \lambda S. \exists \Delta'. \vdash \Gamma' \vdash S$ . Trivial.

**Case TAPP.** The derivation has the form

$$\frac{\Gamma(r) = \text{code } [\alpha, \Delta'].\Gamma' \quad \Delta \vdash \tau'}{\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta].\Gamma\{r : \text{code } [\Delta'].\Gamma'[\tau'/\alpha]\}}$$

Suppose  $\Gamma = \Gamma'' \cup \{r \rightsquigarrow \text{code } [\alpha, \Delta'].\Gamma'\}$ .

We have  $\vdash [\Delta].\Gamma \vdash = \lambda S. \exists \Delta. \vdash \Gamma \vdash S = \lambda S. \exists \Delta. \vdash \Gamma'' \vdash S \wedge \text{codeptr}(S.R(r), \exists \alpha. \Delta'.\Gamma')$  and  $\vdash \text{code } [\Delta'].\Gamma'[\tau'/\alpha] \vdash = \lambda S. \exists \Delta. \vdash \Gamma \vdash S = \lambda S. \exists \Delta. \vdash \Gamma'' \vdash S \wedge \text{codeptr}(S.R(r), \exists \Delta'.\Gamma'[\tau'/\alpha])$ .

It is easy to show  $[\forall S. \exists \Delta'. \vdash \Gamma'[\tau'/\alpha] \vdash S \rightarrow \exists \alpha. \Delta'. \vdash \Gamma' \vdash S] \vdash_{\Psi \vdash, \Phi \vdash}$ . Trivial.

**Case PACK.** The derivation has the form

$$\frac{\Gamma(r) = \tau[\tau'/\alpha] \quad \Delta \vdash \tau'}{\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta].\Gamma\{r : \exists \alpha. \tau\}}$$

Suppose  $\Gamma = \Gamma'' \cup \{r \rightsquigarrow \tau[\tau'/\alpha]\}$ .

We have  $\vdash [\Delta].\Gamma \vdash = \lambda S. \exists \Delta. \vdash \Gamma'' \vdash S \wedge \vdash \tau[\tau'/\alpha] \vdash S.R(r)$  and

$\vdash [\Delta].\Gamma\{r : \exists \alpha. \tau\} \vdash = \lambda S. \exists \Delta. \vdash \Gamma'' \vdash S \wedge \exists \alpha. \vdash \tau \vdash S.R(r)$ . Trivial.

**Case UNPACK.** The derivation has the form

$$\frac{\Gamma(r) = \exists \alpha. \tau}{\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\alpha, \Delta].\Gamma\{r : \tau\}}$$

Suppose  $\Gamma = \Gamma'' \cup \{r \rightsquigarrow \exists \alpha. \tau\}$ .

We have  $\vdash [\Delta].\Gamma \vdash = \lambda S. \exists \Delta. \vdash \Gamma'' \vdash S \wedge \exists \alpha. \vdash \tau \vdash S.R(r)$  and

$\vdash [\alpha, \Delta].\Gamma\{r : \tau\} \vdash = \lambda S. \exists \alpha. \Delta. \vdash \Gamma'' \vdash S \wedge \vdash \tau[\tau'/\alpha] \vdash S.R(r)$ . Trivial.

**Case FOLD.** The derivation has the form

$$\frac{\Gamma(r) = \tau[\mu \alpha. \tau/\alpha]}{\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta].\Gamma\{r : \mu \alpha. \tau\}}$$

Suppose  $\Gamma = \Gamma'' \cup \{r \rightsquigarrow \tau[\mu \alpha. \tau/\alpha]\}$ .

We have  $\vdash [\Delta].\Gamma \vdash = \lambda S. \exists \Delta. \vdash \Gamma'' \vdash S \wedge \vdash \tau[\mu \alpha. \tau/\alpha] \vdash S.R(r)$  and

$\vdash [\Delta].\Gamma\{r : \mu \alpha. \tau\} \vdash = \lambda S. \exists \Delta. \vdash \Gamma'' \vdash S \wedge (\mu \alpha. \lambda x. \vdash \tau \vdash x \ S.R(r))$ . Trivial.

**Case UNFOLD.** The derivation has the form

$$\frac{\Gamma(r) = \mu\alpha.\tau}{\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta].\Gamma\{r:\tau[\mu\alpha.\tau/\alpha]\}}$$

Suppose  $\Gamma = \Gamma'' \cup \{r \rightsquigarrow \mu\alpha.\tau\}$ .

We have  $\vdash [\Delta].\Gamma^\sqsupseteq = \lambda S. \exists \Delta. \vdash \Gamma''^\sqsupseteq S \wedge (\mu\alpha.\lambda x.\vdash \tau^\sqsupseteq x S.R(r))$  and

$\vdash \Gamma\{r:\tau[\mu\alpha.\tau/\alpha]\}^\sqsupseteq = \lambda S. \exists \Delta. \vdash \Gamma''^\sqsupseteq S \wedge \vdash \tau[\mu\alpha.\tau/\alpha]^\sqsupseteq S.R(r)$ . Trivial.  $\blacksquare$

### Lemma C.6 (Instruction Typing Preservation)

If  $\Psi \vdash_{\text{TAL}} \{\Gamma\} c \{\Gamma'\}$  then  $\vdash [\Delta].\Gamma^\sqsupseteq \Rightarrow_c \vdash [\Delta].\Gamma'^\sqsupseteq$ .

Proof Sketch. By casing on derivation  $\Psi \vdash_{\text{TAL}} \{\Gamma\} c \{\Gamma'\}$ . Trivial.  $\blacksquare$

### Lemma C.7 (Instruction Sequence Typing Preservation)

If  $\Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\} \mathbb{I}$  then  $\vdash \Psi^\sqsupseteq \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma^\sqsupseteq\} \mathbb{I}$ .

Proof Sketch. By induction over derivation  $\Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\} \mathbb{I}$ .

case WEAKEN. The derivation has the form

$$\frac{\vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta'].\Gamma' \quad \Psi \vdash_{\text{TAL}} \{[\Delta'].\Gamma'\} \mathbb{I}}{\Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\} \mathbb{I}}$$

By induction hypodissertation it follows that  $\vdash \Psi^\sqsupseteq \vdash_{\text{XCAP}} \{\vdash [\Delta'].\Gamma'^\sqsupseteq\} \mathbb{I}$ . By Subtyping Preservation (Lemma C.5) it follows that  $\vdash [\Delta].\Gamma^\sqsupseteq \Rightarrow \vdash [\Delta'].\Gamma'^\sqsupseteq$ . The by XCAP Instruction Sequence Weakening (Lemma B.1) it follows that  $\vdash \Psi^\sqsupseteq \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma^\sqsupseteq\} \mathbb{I}$ .

case SEQ. The derivation has the form

$$\frac{\Psi \vdash_{\text{TAL}} \{\Gamma\} c \{\Gamma'\} \quad \Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma'\} \mathbb{I} \quad c \in \{\text{add, addi, mov, movi, ld, st}\}}{\Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\} c; \mathbb{I}}$$

By induction hypodissertation it follows that  $\vdash \Psi^\sqsupseteq \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma'^\sqsupseteq\} \mathbb{I}$ . By Instruction Typing Preservation (Lemma C.6) it follows that  $\vdash [\Delta].\Gamma^\sqsupseteq \Rightarrow_c \vdash [\Delta].\Gamma'^\sqsupseteq$ . By XCAP rule SEQ it follows that  $\vdash \Psi^\sqsupseteq \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma^\sqsupseteq\} c; \mathbb{I}$ .

case JD. The derivation has the form

$$\frac{f \in \text{dom}(\Psi) \quad \vdash_{\text{TAL}} [\Delta].\Gamma \leq \Psi(f)}{\Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\} \text{jd } f}$$

By Subtyping Preservation (Lemma C.5) it follows that  $\vdash [\Delta].\Gamma^\sqsupseteq \Rightarrow \vdash \Psi(f)^\sqsupseteq$ , and then  $\vdash [\Delta].\Gamma^\sqsupseteq \Rightarrow \vdash \Psi^\sqsupseteq(f)$ . By XCAP rule JD it follows that  $\vdash \Psi^\sqsupseteq \vdash_{\text{XCAP}} \{\vdash [\Delta].\Gamma^\sqsupseteq\} \text{jd } f$ .

case JMP. The derivation has the form

$$\frac{\Gamma(r) = \text{code } [\Delta'].\Gamma' \quad \vdash_{\text{TAL}} [\Delta].\Gamma \leq [\Delta'].\Gamma'}{\Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\} \text{jmp } r}$$

By translation it follows that  $\Gamma[\Delta].\Gamma \Rightarrow \lambda(\mathbb{H}, \mathbb{R}).\text{cptr}(\mathbb{R}(r), \Gamma[\Delta'].\Gamma')$ . By Subtyping Preservation (Lemma C.5) it follows that  $\Gamma[\Delta].\Gamma \Rightarrow \Gamma[\Delta'].\Gamma'$ . By XCAP rule JMP it follows that  $\Gamma \vdash_{\text{XCAP}} \{\Gamma[\Delta].\Gamma\} \text{jmp } r$ .

case MOVF. The derivation has the form

$$\frac{f \in \text{dom}(\Psi) \quad \Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\{r_d \rightsquigarrow \text{code } \Psi(f)\}\} \mathbb{I}}{\Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\} \text{movi } r_d, f; \mathbb{I}}$$

By induction hypodissertation it follows that  $\Gamma \vdash_{\text{XCAP}} \{\Gamma[\Delta].\Gamma\{r_d \rightsquigarrow \text{code } \Psi(f)\}\} \mathbb{I}$ . By XCAP rule ECP and SEQ it follows that  $\Gamma \vdash_{\text{XCAP}} \{\Gamma[\Delta].\Gamma\} \text{movi } r_d, f; \mathbb{I}$ . ■

### Lemma C.8 (Code Heap Typing Preservation)

If  $\Psi_{IN} \vdash_{\text{TAL}} \mathbb{C} : \Psi$  then  $\Gamma \Psi_{IN} \vdash_{\text{XCAP}} \mathbb{C} : \Gamma \Psi$ .

Proof Sketch. By induction on derivation  $\Psi_{IN} \vdash_{\text{TAL}} \mathbb{C} : \Psi$ .

case CDHP. The derivation has the form

$$\frac{\Psi_{IN} \vdash_{\text{TAL}} \{\Psi(f)\} \mathbb{C}(f) \quad \forall f \in \text{dom}(\Psi)}{\Psi_{IN} \vdash_{\text{TAL}} \mathbb{C} : \Psi}$$

From  $\Psi_{IN} \vdash_{\text{TAL}} \{\Psi(f)\} \mathbb{C}(f)$  by Instruction Sequence Typing Preservation (Lemma C.7) it follows that  $\Gamma \Psi_{IN} \vdash_{\text{XCAP}} \{\Gamma \Psi(f)\} \mathbb{C}(f)$ .

case LINK. The derivation has the form

$$\frac{\Psi_{IN1} \vdash_{\text{TAL}} \mathbb{C}_1 : \Psi_1 \quad \Psi_{IN2} \vdash_{\text{TAL}} \mathbb{C}_2 : \Psi_2 \quad \Psi_{IN1}(f) = \Psi_{IN2}(f) \quad \text{dom}(\mathbb{C}_1) \cap \text{dom}(\mathbb{C}_2) = \emptyset \quad \forall f \in \text{dom}(\Psi_{IN1}) \cap \text{dom}(\Psi_{IN2})}{\Psi_{IN1} \cup \Psi_{IN2} \vdash_{\text{TAL}} \mathbb{C}_1 \cup \mathbb{C}_2 : \Psi_1 \cup \Psi_2}$$

From  $\Psi_{INi} \vdash_{\text{TAL}} \mathbb{C}_i : \Psi_i$  by Instruction Sequence Typing Preservation (Lemma C.7) it follows that  $\Gamma \Psi_{INi} \vdash_{\text{XCAP}} \mathbb{C}_i : \Gamma \Psi_i$ . By XCAP rule LINK it follows that

$$\Gamma \Psi_{IN1} \cup \Gamma \Psi_{IN2} \vdash_{\text{XCAP}} \mathbb{C}_1 \cup \mathbb{C}_2 : \Gamma \Psi_1 \cup \Gamma \Psi_2$$
■

### Theorem C.9 (Program Typing Preservation)

If  $\Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\} \mathbb{P}$  then  $\Gamma \vdash_{\text{XCAP}} \{\Gamma[\Delta].\Gamma\} \mathbb{P}$ .

Proof Sketch. Derivation  $\Psi \vdash_{\text{TAL}} \{[\Delta].\Gamma\} \mathbb{P}$  has the form

$$\frac{\Psi_G \vdash_{\text{TAL}} \mathbb{C} : \Psi_G \quad \Psi_G \vdash_{\text{TAL}} \mathbb{S} : [\Delta].\Gamma \quad \Psi_G \vdash_{\text{TAL}} \{[\Delta].\Gamma\}\mathbb{I}}{\Psi_G \vdash_{\text{TAL}} \{[\Delta].\Gamma\}(\mathbb{C}, \mathbb{S}, \mathbb{I})}$$

From  $\Psi_G \vdash_{\text{TAL}} \mathbb{C} : \Psi_G$  by Code Heap Typing Preservation (Lemma C.8) it follows that

$$\ulcorner \Psi_G \urcorner \vdash_{\text{XCAP}} \mathbb{C} : \urcorner \Psi_G \urcorner.$$

From  $\Psi_G \vdash_{\text{TAL}} \mathbb{S} : [\Delta].\Gamma$  by State Typing Preservation (Lemma C.4) it follows that

$$[\ulcorner \Gamma \urcorner] \ulcorner \Psi_G \urcorner \vdash \mathbb{S}.$$

From  $\Psi_G \vdash_{\text{TAL}} \{[\Delta].\Gamma\}\mathbb{I}$  by Instruction Sequence Typing Preservation (Lemma C.7) it

follows that  $\Psi_G \vdash_{\text{XCAP}} \{[\Delta].\Gamma\}\mathbb{I}$ . ■



## Appendix D

# XCAP86 Soundness Proof

Following the soundness proof of the XCAP in Appendix B, we present the soundness proof of XCAP86 discussed in Chapter 6.

### Lemma D.1 (XCAP86 Instruction Sequence Weakening)

If  $\Psi \vdash \{a\} \mathbb{I}$ ,  $\Psi \subseteq \Psi'$ , and  $a' \Rightarrow a$  then  $\Psi' \vdash \{a'\} \mathbb{I}$ .

**Proof Sketch.** The proof is by induction over the derivation  $\Psi \vdash \{a\} \mathbb{I}$ , named as  $\mathcal{D}$ .

**Case SEQ.** The derivation  $\mathcal{D}$  has the form

$$\frac{a \Rightarrow_C a'' \quad \Psi \vdash \{a''\} \mathbb{I}'}{\Psi \vdash \{a\} c; \mathbb{I}'}$$

We have

1. from  $a \Rightarrow a'$  and  $a \Rightarrow_C a''$  it follows that  $a' \Rightarrow_C a''$ ;
2. from  $\Psi \vdash \{a''\} \mathbb{I}'$  by the induction hypothesis it follows that  $\Psi' \vdash \{a''\} \mathbb{I}'$ .

**Case JMPI.** The derivation  $\mathcal{D}$  has the form

$$\frac{a \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi)}{\Psi \vdash \{a\} \text{jmp } f}$$

From  $a \Rightarrow a'$  and  $a \Rightarrow \Psi(f)$  it follows that  $a' \Rightarrow \Psi(f)$ .

**Case JCC.** The derivation  $\mathcal{D}$  has the form

$$\frac{\begin{array}{c} (\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \neg \hat{F}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow a'' \quad \Psi \vdash \{a''\} \mathbb{I}' \\ (\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \hat{F}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow \Psi(f) \quad f \in \text{dom}(\Psi) \end{array}}{\Psi \vdash \{a\} jcc f; \mathbb{I}'}$$

We have

1. from  $a \Rightarrow a'$  and  $(\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \neg \hat{F}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow a''$  it follows that  
 $(\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \neg \hat{F}(cc) \rangle \wedge a'(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow a'';$
2. from  $\Psi \vdash \{a''\} \mathbb{I}'$  and the induction hypodissertation it follows that  $\Psi' \vdash \{a''\} \mathbb{I}'$ ;
3. from  $a \Rightarrow a'$  and  $(\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \hat{F}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow \Psi(f)$  it follows that  
 $(\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \hat{F}(cc) \rangle \wedge a'(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow \Psi(f).$

**Case JMPR.** The derivation  $\mathcal{D}$  has the form

$$\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a'') \quad a \Rightarrow a''}{\Psi \vdash \{a\} \text{jmp } r}$$

From  $a \Rightarrow a'$ ,  $a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a')$ , and  $a \Rightarrow a''$  it follows that

$a' \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a')$ , and  $a' \Rightarrow a''$ .

**Case ECP.** The derivation  $\mathcal{D}$  has the form

$$\frac{(\lambda S. \text{cptr}(f, \Psi(f)) \wedge a S) \Rightarrow a'' \quad f \in \text{dom}(\Psi) \quad \Psi \vdash \{a''\} \mathbb{I}}{\Psi \vdash \{a\} \mathbb{I}}$$

We have

1. from  $a \Rightarrow a'$  and  $(\lambda S. \text{cptr}(f, \Psi(f)) \wedge a S) \Rightarrow a''$  it follows that  
 $(\lambda S. \text{cptr}(f, \Psi(f)) \wedge a' S) \Rightarrow a'';$
2. from  $\Psi \vdash \{a''\} \mathbb{I}$  by the induction hypodissertation it follows that  $\Psi' \vdash \{a''\} \mathbb{I}$ .

**Case CALLI.** The derivation  $\mathcal{D}$  has the form

$$\frac{a \Rightarrow_{\text{push } f_{\text{ret}}} \Psi(f) \quad f \in \text{dom}(\Psi)}{\Psi \vdash \{a\} \text{ call } f; [f_{\text{ret}}]}$$

From  $a \Rightarrow a'$  and  $a \Rightarrow_{\text{push } f_{\text{ret}}} \Psi(f)$ , it follows that  $a' \Rightarrow_{\text{push } f_{\text{ret}}} \Psi(f)$ .

**Case CALLR.** The derivation  $\mathcal{D}$  has the form

$$\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a'') \quad a \Rightarrow_{\text{push}} f_{\text{ret}} a''}{\Psi \vdash \{a\} \text{call } r; [f_{\text{ret}}]}$$

From  $a \Rightarrow a'$ ,  $a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a'')$ , and  $a \Rightarrow_{\text{push}} f_{\text{ret}} a''$ , it follows that  $a' \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a'')$ , and  $a' \Rightarrow_{\text{push}} f_{\text{ret}} a''$ .

**Case RET.** The derivation  $\mathcal{D}$  has the form

$$\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{H}(\mathbb{R}(\text{esp})), a'') \quad a \Rightarrow_{\text{pop}} a''}{\Psi \vdash \{a\} \text{ret}}$$

From  $a \Rightarrow a'$ ,  $a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{H}(\mathbb{R}(\text{esp})), a'')$ , and  $a \Rightarrow_{\text{pop}} a''$ , it follows that  $a' \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{H}(\mathbb{R}(\text{esp})), a'')$ , and  $a' \Rightarrow_{\text{pop}} a''$ . ■

### Lemma D.2 (XCAP86 Code Heap Typing)

If  $\Psi_{IN} \vdash \mathbb{C} : \Psi$  and  $f \in \text{dom}(\Psi)$  then  $f \in \text{dom}(\mathbb{C})$  and  $\Psi_{IN} \vdash \{\Psi(f)\} \mathbb{C}(f)$ .

**Proof Sketch.** The proof is same as the XCAP Code Heap Typing (Lemma B.2). ■

### Lemma D.3 (XCAP86 State Typing)

If  $\Psi_{IN} \vdash \mathbb{C} : \Psi$  and  $\llbracket \text{cptr}(f, a) \rrbracket_{\Psi}$  then  $f \in \text{dom}(\mathbb{C})$  and  $\Psi_{IN} \vdash \{a\} \mathbb{C}(f)$ .

**Proof Sketch.** The proof is same as the XCAP State Typing (Lemma B.3). ■

### Lemma D.4 (XCAP86 Progress)

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \longmapsto \mathbb{P}'$ .

**Proof Sketch.** Derivation  $\Psi_G \vdash \{a\} \mathbb{P}$  has the following form.

$$\frac{\Psi_G \vdash \mathbb{C} : \Psi_G \quad ((\text{DC}(\mathbb{C}) * \llbracket a \rrbracket_{\Psi_G}) \ \mathbb{S}) \quad \text{lookup}(\mathbb{C}, pc, \mathbb{I}) \quad \Psi_G \vdash \{a\} \mathbb{I}}{\Psi_G \vdash \{a\} (\mathbb{S}, pc)}$$

The proof is by induction over derivation  $\Psi_G \vdash \{a\} \mathbb{I}$ .

**Case SEQ, JMPI, JCC, JMPR, CALLI, CALLR, and RET.** the proof is by simple inspections.

**Case ECP.** The proof is by the induction hypodissertation. ■

### Lemma D.5 (XCAP86 Preservation)

If  $\Psi_G \vdash \{a\} \mathbb{P}$  and  $\mathbb{P} \longmapsto \mathbb{P}'$  then there exists an assertion  $a'$  such that  $\Psi_G \vdash \{a'\} \mathbb{P}'$ .

**Proof Sketch.** Suppose  $\mathbb{P} = ((\mathbb{H}, \mathbb{R}, \mathbb{F}), pc)$ . We name derivation  $\Psi_G \vdash \{a\} \mathbb{P}$  as  $\mathcal{D}$ , which has the following form.

$$\frac{\Psi_G \vdash \mathbb{C} : \Psi_G \quad ((\mathbf{DC}(\mathbb{C}) * [\![\mathbf{a}]\!]_{\Psi_G}) \ (\mathbb{H}, \mathbb{R}, \mathbb{F})) \quad \mathbf{lookup}(\mathbb{C}, pc, \mathbb{I}) \quad \Psi_G \vdash \{\mathbf{a}\} \mathbb{I}}{\Psi_G \vdash \{\mathbf{a}\} ((\mathbb{H}, \mathbb{R}, \mathbb{F}), pc)}$$

The proof is by induction over the derivation  $\Psi_G \vdash \{\mathbf{a}\} \mathbb{I}$ , named as  $\mathcal{E}$ .

**Case SEQ.** The derivation  $\mathcal{E}$  has the form

$$\frac{\mathbf{a} \Rightarrow_{\mathbf{C}} \mathbf{a}' \quad \Psi_G \vdash \{\mathbf{a}'\} \mathbb{I}'}{\Psi_G \vdash \{\mathbf{a}\} \mathbf{c}; \mathbb{I}'}$$

By the operational semantics,  $\mathbb{P}' = (\mathbf{Next}_{\mathbf{C}}(\mathbb{H}, \mathbb{R}, \mathbb{F}), npc)$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2. from  $((\mathbf{DC}(\mathbb{C}) * [\![\mathbf{a}]\!]_{\Psi_G}) \ (\mathbb{H}, \mathbb{R}, \mathbb{F}))$  and  $\mathbf{a} \Rightarrow_{\mathbf{C}} \mathbf{a}'$  it follows that  $((\mathbf{DC}(\mathbb{C}) * [\![\mathbf{a}']\!]_{\Psi_G}) \ \mathbf{Next}_{\mathbf{C}}(\mathbb{H}, \mathbb{R}, \mathbb{F}))$ ;
3. from  $\mathbf{lookup}(\mathbb{C}, pc, \mathbb{I})$  and  $\mathbf{Dc}(\mathbb{H}, pc) = (\mathbf{c}, npc)$  it follows  $\mathbf{lookup}(\mathbb{C}, npc, \mathbb{I}')$ ;
4.  $\Psi_G \vdash \{\mathbf{a}'\} \mathbb{I}'$  is in  $\mathcal{E}$ .

**Case JMPI.** The derivation  $\mathcal{E}$  has the form

$$\frac{\mathbf{a} \Rightarrow \Psi_G(\mathbf{f}) \quad \mathbf{f} \in \mathbf{dom}(\Psi_G)}{\Psi_G \vdash \{\mathbf{a}\} \mathbf{jmp} \mathbf{f}}$$

By the operational semantics,  $\mathbb{P}' = ((\mathbb{H}, \mathbb{R}, \mathbb{F}), \mathbf{f})$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2. from  $((\mathbf{DC}(\mathbb{C}) * [\![\mathbf{a}]\!]_{\Psi_G}) \ (\mathbb{H}, \mathbb{R}, \mathbb{F}))$  and  $\mathbf{a} \Rightarrow \Psi_G(\mathbf{f})$  it follows that  $((\mathbf{DC}(\mathbb{C}) * [\![\Psi_G(\mathbf{f})]\!]_{\Psi_G}) \ (\mathbb{H}, \mathbb{R}, \mathbb{F}))$ ;
3. from  $\mathbf{f} \in \mathbf{dom}(\Psi_G)$  and  $\Psi_G \vdash \mathbb{C} : \Psi_G$  by Code Heap Typing (Lemma D.2) it follows that  $\mathbf{f} \in \mathbf{dom}(\mathbb{C})$ , and it follows that  $\mathbf{lookup}(\mathbb{C}, \mathbf{f}, \mathbb{C}(\mathbf{f}))$ ;
4. from  $\mathbf{f} \in \mathbf{dom}(\Psi_G)$  and  $\Psi_G \vdash \mathbb{C} : \Psi_G$  by Code Heap Typing (Lemma D.2) it follows that  $\Psi_G \vdash \{\Psi_G(\mathbf{f})\} \mathbb{C}(\mathbf{f})$ .

**Case JCC.** The derivation  $\mathcal{E}$  has the form

$$\begin{array}{c}
 (\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \neg \hat{\mathbb{F}}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow a' \quad \Psi_G \vdash \{a'\} \mathbb{I}' \\
 (\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \hat{\mathbb{F}}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow \Psi_G(f) \quad f \in \text{dom}(\Psi_G) \\
 \hline
 \Psi_G \vdash \{a\} jcc f; \mathbb{I}' 
 \end{array}$$

By the operational semantics, when  $\hat{\mathbb{F}}(cc), \mathbb{P}' = ((\mathbb{H}, \mathbb{R}, \mathbb{F}), f)$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2. from  $((\text{DC}(\mathbb{C}) * \llbracket a \rrbracket_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$  and  $(\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \hat{\mathbb{F}}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow \Psi_G(f)$  it follows that  $((\text{DC}(\mathbb{C}) * \llbracket \Psi_G(f) \rrbracket_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$ ;
3. from  $f \in \text{dom}(\Psi_G)$  and  $\Psi_G \vdash \mathbb{C} : \Psi_G$  by Code Heap Typing (Lemma D.2) it follows that  $f \in \text{dom}(\mathbb{C})$ , and it follows that  $\text{lookup}(\mathbb{C}, f, \mathbb{C}(f))$ ;
4. from  $f \in \text{dom}(\Psi_G)$  and  $\Psi_G \vdash \mathbb{C} : \Psi_G$  by Code Heap Typing (Lemma D.2) it follows that  $\Psi_G \vdash \{\Psi_G(f)\} \mathbb{C}(f)$ .

By the operational semantics, when  $\neg \hat{\mathbb{F}}(cc), \mathbb{P}' = ((\mathbb{H}, \mathbb{R}, \mathbb{F}), \text{npc})$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2. from  $((\text{DC}(\mathbb{C}) * \llbracket a \rrbracket_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$  and  $(\lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \langle \neg \hat{\mathbb{F}}(cc) \rangle \wedge a(\mathbb{H}, \mathbb{R}, \mathbb{F})) \Rightarrow a'$  it follows that  $((\text{DC}(\mathbb{C}) * \llbracket a' \rrbracket_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$ ;
3. from  $\text{lookup}(\mathbb{C}, pc, \mathbb{I})$  and  $\text{Dc}(\mathbb{H}, pc) = (c, \text{npc})$  it follows  $\text{lookup}(\mathbb{C}, \text{npc}, \mathbb{I}')$ ;
4.  $\Psi_G \vdash \{a'\} \mathbb{I}'$  is in  $\mathcal{E}$ .

Case JMPR. The derivation  $\mathcal{E}$  has the form

$$\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a') \quad a \Rightarrow a'}{\Psi_G \vdash \{a\} \text{jmp } r}$$

From  $((\text{DC}(\mathbb{C}) * \llbracket a \rrbracket_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$ ,  $a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a')$  and  $a \Rightarrow a'$  it follows that  $((\text{DC}(\mathbb{C}) * \llbracket a' \rrbracket_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$  and  $\llbracket \text{cptr}(\mathbb{R}(r), a') \rrbracket_{\Psi_G}$ .

By the operational semantics,  $\mathbb{P}' = ((\mathbb{H}, \mathbb{R}, \mathbb{F}), \mathbb{R}(r))$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;

2.  $((DC(C) * [[a']]_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$  is from above;
3. by  $\Psi_G \vdash C : \Psi_G, [[cptr(\mathbb{R}(r), a')]_{\Psi_G}}$ , and State Typing (Lemma D.3) it follows that  $\mathbb{R}(r) \in \text{dom}(C)$ , and it follows that  $\text{lookup}(C, \mathbb{R}(r), C(\mathbb{R}(r)))$ ;
4. by  $\Psi_G \vdash C : \Psi_G, [[cptr(\mathbb{R}(r), a')]_{\Psi_G}}$ , and State Typing (Lemma D.3) it follows that  $\Psi_G \vdash \{a'\} \mathbb{R}(r)$ .

Case ECP. The derivation  $\mathcal{E}$  has the form

$$\frac{(\lambda S. \text{cptr}(f, \Psi_G(f)) \wedge a S) \Rightarrow a' \quad f \in \text{dom}(\Psi_G) \quad \Psi_G \vdash \{a'\} \mathbb{I}}{\Psi_G \vdash \{a\} \mathbb{I}}$$

From  $((DC(C) * [[a]]_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$ ,  $(\lambda S. \text{cptr}(f, \Psi_G(f)) \wedge a S) \Rightarrow a'$ , and  $f \in \text{dom}(\Psi_G)$  it follows that  $((DC(C) * [[a']]_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$ . Together with  $\Psi_G \vdash \{a'\} \mathbb{I}$ , the prove is completed by using the induction hypodissertation.

Case CALLI. The derivation  $\mathcal{E}$  has the form

$$\frac{a \Rightarrow_{\text{push}} f_{\text{ret}} \Psi_G(f) \quad f \in \text{dom}(\Psi_G)}{\Psi_G \vdash \{a\} \text{call } f; [f_{\text{ret}}]}$$

By the operational semantics,  $\mathbb{P}' = (\text{Next}_{\text{push}} f_{\text{ret}} (\mathbb{H}, \mathbb{R}, \mathbb{F}), f)$ . Then

1.  $\Psi_G \vdash C : \Psi_G$  is in  $\mathcal{D}$ ;
2. from  $((DC(C) * [[a]]_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$  and  $a \Rightarrow_{\text{push}} f_{\text{ret}} \Psi_G(f)$  it follows that  $((DC(C) * [[\Psi_G(f)]_{\Psi_G}]_{\Psi_G}) \text{Next}_{\text{push}} f_{\text{ret}} (\mathbb{H}, \mathbb{R}, \mathbb{F}))$ ;
3. from  $f \in \text{dom}(\Psi_G)$  and  $\Psi_G \vdash C : \Psi_G$  by Code Heap Typing (Lemma D.2) it follows that  $f \in \text{dom}(C)$ , and it follows that  $\text{lookup}(C, f, C(f))$ ;
4. from  $f \in \text{dom}(\Psi_G)$  and  $\Psi_G \vdash C : \Psi_G$  by Code Heap Typing (Lemma D.2) it follows that  $\Psi_G \vdash \{\Psi_G(f)\} C(f)$ .

Case CALLR. The derivation  $\mathcal{E}$  has the form

$$\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}). \text{cptr}(\mathbb{R}(r), a') \quad a \Rightarrow_{\text{push}} f_{\text{ret}} a'}{\Psi_G \vdash \{a\} \text{call } r; [f_{\text{ret}}]}$$

From  $((DC(\mathbb{C}) * [\![a]\!]_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$ ,  $a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}).cptr(\mathbb{R}(r), a')$  and  $a \Rightarrow_{push f_{ret}} a'$  it follows that  $[\![cptr(\mathbb{R}(r), a')]\!]_{\Psi_G}$  and  $((DC(\mathbb{C}) * [\![a']]\!]_{\Psi_G}) \text{Next}_{push f_{ret}} (\mathbb{H}, \mathbb{R}, \mathbb{F})$ .

By the operational semantics,  $\mathbb{P}' = (\text{Next}_{push f_{ret}} (\mathbb{H}, \mathbb{R}, \mathbb{F}), \mathbb{R}(r))$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2.  $((DC(\mathbb{C}) * [\![a']]\!]_{\Psi_G}) \text{Next}_{push f_{ret}} (\mathbb{H}, \mathbb{R}, \mathbb{F})$  is from above;
3. by  $\Psi_G \vdash \mathbb{C} : \Psi_G$ ,  $[\![cptr(\mathbb{R}(r), a')]\!]_{\Psi_G}$ , and State Typing (Lemma D.3) it follows that  $\mathbb{R}(r) \in \text{dom}(\mathbb{C})$ , and it follows that  $\text{lookup}(\mathbb{C}, \mathbb{R}(r), \mathbb{C}(\mathbb{R}(r)))$ ;
4. by  $\Psi_G \vdash \mathbb{C} : \Psi_G$ ,  $[\![cptr(\mathbb{R}(r), a')]\!]_{\Psi_G}$ , and State Typing (Lemma D.3) it follows that  $\Psi_G \vdash \{a'\} \mathbb{C}(\mathbb{R}(r))$ .

Case RET. The derivation  $\mathcal{E}$  has the form

$$\frac{a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}).cptr(\mathbb{H}(\mathbb{R}(\mathbb{R}(\mathbb{R}))) a') \quad a \Rightarrow_{pop} a'}{\Psi \vdash \{a\} \text{ret}}$$

From  $((DC(\mathbb{C}) * [\![a]\!]_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F}))$ ,  $a \Rightarrow \lambda(\mathbb{H}, \mathbb{R}, \mathbb{F}).cptr(\mathbb{H}(\mathbb{R}(\mathbb{R}))) a'$  and  $a \Rightarrow_{pop} a'$  it follows that  $[\![cptr(\mathbb{H}(\mathbb{R}(\mathbb{R})), a')]\!]_{\Psi_G}$  and  $((DC(\mathbb{C}) * [\![a']]\!]_{\Psi_G}) \text{Next}_{pop} (\mathbb{H}, \mathbb{R}, \mathbb{F})$ .

By the operational semantics,  $\mathbb{P}' = (\text{Next}_{pop} (\mathbb{H}, \mathbb{R}, \mathbb{F}), \mathbb{H}(\mathbb{R}(\mathbb{R})))$ . Then

1.  $\Psi_G \vdash \mathbb{C} : \Psi_G$  is in  $\mathcal{D}$ ;
2.  $((DC(\mathbb{C}) * [\![a']]\!]_{\Psi_G}) (\mathbb{H}, \mathbb{R}, \mathbb{F})$  is from above;
3. by  $\Psi_G \vdash \mathbb{C} : \Psi_G$ ,  $[\![cptr(\mathbb{H}(\mathbb{R}(\mathbb{R})), a')]\!]_{\Psi_G}$ , and State Typing (Lemma D.3) it follows that  $\mathbb{H}(\mathbb{R}(\mathbb{R})) \in \text{dom}(\mathbb{C})$ , and it follows that  $\text{lookup}(\mathbb{C}, \mathbb{H}(\mathbb{R}(\mathbb{R})), \mathbb{C}(\mathbb{H}(\mathbb{R}(\mathbb{R}))))$ ;
4. by  $\Psi_G \vdash \mathbb{C} : \Psi_G$ ,  $[\![cptr(\mathbb{H}(\mathbb{R}(\mathbb{R})), a')]\!]_{\Psi_G}$ , and State Typing (Lemma D.3) it follows that  $\Psi_G \vdash \{a'\} \mathbb{C}(\mathbb{H}(\mathbb{R}(\mathbb{R})))$ . ■

### Theorem D.6 (XCAP86 Soundness)

If  $\Psi_G \vdash \{a\} \mathbb{P}$ , then for all natural number  $n$ , there exists a program  $\mathbb{P}'$  such that  $\mathbb{P} \xrightarrow{n} \mathbb{P}'$ .

Proof Sketch. By simple induction on  $n$  and using Progress (Lemma D.4) and Perservation (Lemma D.5). ■



# Bibliography

- [1] A. Ahmed and D. Walker. The logical approach to stack typing. In *Proceedings of the 2003 ACM SIGPLAN international workshop on Types in languages design and implementation*, pages 74–85. ACM Press, 2003.
- [2] A. J. Ahmed. *Semantics of Types for Mutable State*. PhD thesis, Princeton University, 2004.
- [3] A. J. Ahmed. Mutable fields in a semantic model of types. Talk presented at 2000 PCC Workshop, June 2000.
- [4] A. J. Ahmed, L. Jia, and D. Walker. Reasoning about hierarchical storage. In *Proc. 18th IEEE Symposium on Logic in Computer Science*, pages 33–44, June 2003.
- [5] A. W. Appel. Foundational proof-carrying code. In *Proc. 16th Annual IEEE Symposium on Logic in Computer Science*, pages 247–258, June 2001.
- [6] A. W. Appel and A. P. Felty. A semantic model of types and machine instructions for proof-carrying code. In *Proc. 27th ACM Symposium on Principles of Programming Languages*, pages 243–253, Jan. 2000.
- [7] A. W. Appel and T. Jim. Continuation-passing, closuring-passing style. In *Proc. 16th ACM Symposium on Principles of Programming Languages*, pages 293–302, Austin, Texas, USA, Jan. 1989. ACM Press.
- [8] A. W. Appel and D. McAllester. An indexed model of recursive types for founda-

- tional proof-carrying code. Technical Report CS-TR-629-00, Princeton University, Dept. of Computer Science, Nov. 2000. To appear in TOPLAS.
- [9] A. W. Appel and D. McAllester. An indexed model of recursive types for foundational proof-carrying code. *ACM Transactions on Programming Languages and Systems*, 23(5):657–683, Sept. 2001.
  - [10] A. W. Appel, P.-A. Mellies, C. D. Richards, and J. Vouillon. A very modal model of a modern, major, general type system. In *Proc. 34th ACM Symposium on Principles of Programming Languages*, Jan. 2007.
  - [11] J. Chen, D. Wu, A. W. Appel, and H. Fang. A provably sound tal for back-end optimization. In *Proc. 2003 ACM Conference on Programming Language Design and Implementation*, pages 208–219. ACM Press, 2003.
  - [12] C. Colby, P. Lee, G. Necula, F. Blau, M. Plesko, and K. Cline. A certifying compiler for Java. In *Proc. 2000 ACM Conference on Programming Language Design and Implementation*, pages 95–107, New York, 2000. ACM Press.
  - [13] K. Crary. Toward a foundational typed assembly language. In *Proc. 30th ACM Symposium on Principles of Programming Languages*, page 198, Jan. 2003.
  - [14] K. Crary and J. C. Vanderwaart. An expressive, scalable type theory for certified code. In *Proc. 7th ACM SIGPLAN International Conference on Functional Programming*, pages 191–205, 2002.
  - [15] N. G. de Bruijn. Lambda calculus notation with nameless dummies. *Indagationes Mathematicae*, 34:381–392, 1972.
  - [16] R. DeLine and M. Fähndrich. Enforcing high-level protocols in low-level software. In *Proc. 2001 ACM Conference on Programming Language Design and Implementation*, pages 59–69, New York, 2001. ACM Press.

- [17] R. S. Engelschall. GNU Pth - the GNU portable threads. <http://www.gnu.org/software/pth/>, 1999-2003.
- [18] A. Felty. Semantic models of types and machine instructions for proof-carrying code. Talk presented at 2000 PCC Workshop, June 2000.
- [19] X. Feng, Z. Ni, Z. Shao, and Y. Guo. An open framework for foundational proof-carrying code. In *Proc. Workshop on Types in Language Design and Implementation*, Jan. 2007.
- [20] X. Feng and Z. Shao. Modular verification of concurrent assembly code with dynamic thread creation and termination. In *Proc. 2005 International Conference on Functional Programming*, pages 254–267, Sept. 2005.
- [21] X. Feng, Z. Shao, A. Vaynberg, S. Xiang, and Z. Ni. Modular verification of assembly code with stack-based control abstractions. In *Proc. 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'06)*, pages 401–414, New York, NY, USA, June 2006. ACM Press.
- [22] R. W. Floyd. Assigning meaning to programs. *Communications of the ACM*, Oct. 1967.
- [23] M. Gargano, M. Hillebrand, D. Leinenbach, and W. Paul. On the correctness of operating system kernels. In *Proc. 18th International Conference on Theorem Proving in Higher-Order Logics*, pages 2–16. Springer-Verlag, 2005.
- [24] N. Glew and G. Morrisett. Type-safe linking and modular assembly language. In *Proc. 26th ACM Symposium on Principles of Programming Languages*, pages 250–261, Jan. 1999.
- [25] M. Gordon. A mechanized Hoare logic of state transitions. In A. W. Roscoe, editor, *A Classical Mind—Essays in Honour of C.A.R. Hoare*, pages 143–160. Prentice Hall, 1994.
- [26] J. Gosling, B. Joy, and G. Steele. *The Java Language Specification*. Addison-Wesley, 1996.

- [27] J. Gosling, B. Joy, G. Steele, and G. Bracha. *The Java Language Specification, Second Edition*. Addison Wesley, 2000.
- [28] N. A. Hamid and Z. Shao. Interfacing hoare logic and type systems for foundational proof-carrying code. In *Proc. 17th International Conference on the Applications of Higher Order Logic Theorem Proving*, pages 118–135. Springer-Verlag, September 2004.
- [29] N. A. Hamid, Z. Shao, V. Trifonov, S. Monnier, and Z. Ni. A syntactic approach to foundational proof-carrying code. In *Proc. 17th Annual IEEE Symposium on Logic in Computer Science*, pages 89–100, July 2002.
- [30] H. Herbelin, F. Kirchner, B. Monate, and J. Narboux. Faq about coq. <http://pauillac.inria.fr/coq/doc/faq.html#htoc38>.
- [31] C. A. R. Hoare. An axiomatic basis for computer programming. *Communications of the ACM*, Oct. 1969.
- [32] T. Hoare. The verifying compiler: A grand challenge for computing research. In *Proc. 2003 International Conference on Compiler Construction (CC'03), Lecture Notes in Computer Science, Volume 2622*, pages 262–272, Warsaw, Poland, Apr. 2003. Springer-Verlag Heidelberg.
- [33] G. C. Hunt, J. R. Larus, M. Abadi, M. Aiken, P. Barham, M. Fahndrich, C. Hawblitzel, O. Hodson, S. Levi, N. Murphy, B. Steensgaard, D. Tarditi, T. Wobber, and B. Zill. An overview of the Singularity project. Technical Report MSR-TR-2005-135, Microsoft Research, Redmond, WA, Oct. 2005.
- [34] Intel Corporation. *Intel Architecture Software Developer's Manual*, volume 1-3. Intel Corporation, 1997.
- [35] T. Jim, G. Morrisett, D. Grossman, M. Hicks, J. Cheney, and Y. Wang. Cyclone: A safe dialect of C. In *USENIX Annual Technical Conference*, June 2002.

- [36] C. League, Z. Shao, and V. Trifonov. Precision in practice: A type-preserving Java compiler. Technical Report YALEU/DCS/TR-1223, Dept. of Computer Science, Yale University, New Haven, CT, Jan. 2002.
- [37] Microsoft Corp., *et al.* Common language infrastructure. Drafts of the ECMA TC39/TG3 standardization process., 2001.
- [38] R. Milner, M. Tofte, R. Harper, and D. MacQueen. *The Definition of Standard ML (Revised)*. MIT Press, Cambridge, Massachusetts, 1997.
- [39] Y. Minamide, G. Morrisett, and R. Harper. Typed closure conversion. In *Proc. 23rd ACM Symposium on Principles of Programming Languages*, pages 271–283. ACM Press, 1996.
- [40] G. Morrisett, K. Crary, N. Glew, and D. Walker. Stack-based typed assembly language. In X. Leroy and A. Ohori, editors, *Proc. 1998 International Workshop on Types in Compilation: LNCS Vol 1473*, pages 28–52, Kyoto, Japan, March 1998. Springer-Verlag.
- [41] G. Morrisett, D. Walker, K. Crary, and N. Glew. From System F to typed assembly language. In *Proc. 25th ACM Symposium on Principles of Programming Languages*, pages 85–97. ACM Press, Jan. 1998.
- [42] D. A. Naumann. Predicate transformer semantics of a higher-order imperative language with record subtyping. *Science of Computer Programming*, 41(1):1–51, 2001.
- [43] G. Necula. Proof-carrying code. In *Proc. 24th ACM Symposium on Principles of Programming Languages*, pages 106–119, New York, Jan. 1997. ACM Press.
- [44] G. Necula. *Compiling with Proofs*. PhD thesis, School of Computer Science, Carnegie Mellon Univ., Sept. 1998.
- [45] G. Necula and P. Lee. Safe kernel extensions without run-time checking. In *Proc. 2nd USENIX Symp. on Operating System Design and Impl.*, pages 229–243, 1996.

- [46] G. C. Necula, S. McPeak, and W. Weimer. CCured: Type-safe retrofitting of legacy code. In *Symposium on Principles of Programming Languages*, pages 128–139, 2002.
- [47] Z. Ni and Z. Shao. Certified assembly programming with embedded code pointers. In *Proc. 33rd Symp. on Principles of Prog. Lang.*, Jan. 2006.
- [48] P. W. O’Hearn and D. J. Pym. The logic of bunched implications. In *Bulletin of Symbolic Logic, Volume 5(2)*, pages 215–144, June 1999.
- [49] P. W. O’Hearn and R. D. Tennent. *Algol-Like Languages*. Birkhauser, Boston, 1997.
- [50] C. Paulin-Mohring. Inductive definitions in the system Coq—rules and properties. In M. Bezem and J. Groote, editors, *Proc. TLCA*, volume 664 of *LNCS*. Springer-Verlag, 1993.
- [51] F. Pfenning. Automated theorem proving. <http://www-2.cs.cmu.edu/~fp/courses/atp/>, Apr. 2004.
- [52] F. Pfenning and C. Elliott. Higher-order abstract syntax. In *Proc. 1988 ACM Conference on Programming Language Design and Implementation*, pages 199–208. ACM Press, 1988.
- [53] J. Reynolds. Separation logic: a logic for shared mutable data structures. In *Proc. 17th Annual IEEE Symposium on Logic in Computer Science*, 2002.
- [54] J. C. Reynolds. Separation logic: A logic for shared mutable data structures. In *Proceedings Seventeenth Annual IEEE Symposium on Logic in Computer Science*, Los Alamitos, California, 2002. IEEE Computer Society.
- [55] Z. Shao, B. Saha, V. Trifonov, and N. Papaspyrou. A type system for certified binaries. In *Proc. 29th ACM Symposium on Principles of Programming Languages*, pages 217–232. ACM Press, Jan. 2002.
- [56] G. Tan. *A Compositional Logic for Control Flow and its Application in Foundational Proof-Carrying Code*. PhD thesis, Princeton University, 2005.

- [57] D. Tarditi, G. Morrisett, P. Cheng, C. Stone, R. Harper, and P. Lee. TIL: A type-directed optimizing compiler for ML. In *Proc. 1996 ACM Conference on Programming Language Design and Implementation*, pages 181–192. ACM Press, 1996.
- [58] The Coq Development Team. The Coq proof assistant reference manual. The Coq release v8.0, Oct. 2005.
- [59] A. K. Wright and M. Felleisen. A syntactic approach to type soundness. *Information and Computation*, 115(1):38–94, 1994.
- [60] H. Xi and R. Harper. A dependently typed assembly language. In *Proc. 6th ACM SIGPLAN International Conference on Functional Programming*, pages 169–180. ACM Press, Sept. 2001.
- [61] H. Xi and F. Pfenning. Dependent types in practical programming. In *Proc. 26th ACM Symposium on Principles of Programming Languages*, pages 214–227. ACM Press, 1999.
- [62] D. Yu, N. A. Hamid, and Z. Shao. Building certified libraries for PCC: Dynamic storage allocation. In *Proc. 2003 European Symposium on Programming (ESOP'03)*, April 2003.
- [63] D. Yu, N. A. Hamid, and Z. Shao. Building certified libraries for PCC: Dynamic storage allocation. *Science of Computer Programming*, 50(1-3):101–127, Mar. 2004.
- [64] D. Yu and Z. Shao. Verification of safety properties for concurrent assembly code. In *Proc. 2004 International Conference on Functional Programming*, Sept. 2004.
- [65] Y. Yu. *Automated Proofs of Object Code For A Widely Used Microprocessor*. PhD thesis, The University of Texas at Austin, 1992.